Nucleic acid encoding 5′-3′ exonuclease of bacteriophage RM 378

ABSTRACT

A novel bacteriophage RM 378 of Rhodothermus marinus, the nucleic acids of its genome, nucleic acids comprising nucleotide sequences of open reading frames (ORFs) of its genome, and polypeptides encoded by the nucleic acids, are described.

RELATED APPLICATIONS

[0001] This application is a Divisional of U.S. application Ser. No. 09/585,858, filed Jun. 1, 2000, which claims the benefit of U.S. Provisional Application No. 60/137,120, filed Jun. 2, 1999, the entire teachings of which are incorporated herein by reference. Five separate divisional applications are being filed concurrently herewith.

BACKGROUND OF THE INVENTION

[0002] The use of thermophilic enzymes has revolutionized the field of recombinant DNA technology. Polymerases (DNA and RNA), ligases, exonucleases, reverse transcriptases, polynucleotide kinases and lysozymes, as well as many other thermophilic enzymes, are of great importance in the research industry today. In addition, thermophilic enzymes are also used in commercial settings (e.g., proteases and lipases used in washing powder, hydrolytic enzymes used in bleaching). Identification of new thermophilic enzymes will facilitate continued DNA research as well as assist in improving commercial enzyme-based products.

SUMMARY OF THE INVENTION

[0003] This invention pertains to a novel bacteriophage of Rhodothermus marinus, bacteriophage RM 378, which can be isolated from its native environment or can be recombinantly produced. The invention additionally pertains to the nucleic acids of the genome of bacteriophage RM 378 as deposited, as well as to the nucleic acids of a portion of the genome of bacteriophage RM 378 as shown in FIG. 1; to isolated nucleic acid molecules containing a nucleotide sequence of an open reading frame (or more than one open reading frame) of the genome of bacteriophage RM 378, such as an open reading frame as set forth in FIG. 2; to isolated nucleic acid molecules encoding a polypeptide obtainable from bacteriophage RM 378 or an active derivative or fragment of the polypeptide (e.g., a DNA polymerase, such as a DNA polymerase lacking exonuclease domains; a 3′-5′ exonuclease, such as a 3′-5′ exonuclease lacking DNA polymerase domain; a 5′-3′ exonuclease (RNase H); a DNA helicase; or an RNA ligase); to DNA constructs containing the isolated nucleic acid molecule operatively linked to a regulatory sequence; and also to host cells comprising the DNA constructs. The invention further pertains to isolated polypeptides encoded by these nucleic acids, as well as active derivatives or fragments of the polypeptides.

[0004] Because the host organism of the RM 378 bacteriophage is a thermophile, the enzymes and proteins of the RM 378 bacteriophage are expected to be significantly more thermostable than those of other (e.g., mesophilic) bacteriophages, such as the T4 bacteriophage of Escherichia coli. The enhanced stability of the enzymes and proteins of RM 378 bacteriophage allows their use under temperature conditions which would be prohibitive for other enzymes, thereby increasing the range of conditions which can be employed not only in DNA research but also in commercial settings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] FIGS. 1A-1Q2 are a depiction of the nucleic acid sequence (SEQ ID NO:1) of the genome of bacteriophage RM 378.

[0006] FIGS. 2A-2C delineate the open reading frames (ORFs) in the genome of bacteriophage RM 378.

[0007] FIGS. 3A-3W depict a sequence alignment of the predicted gene products of ORF056e and ORF632e and sequences of DNA polymerases of family B. The sequence marked RM378 (SEQ ID NO:36) is the combined sequences of the gene products of ORF056e and ORF632e in bacteriophage RM378. The end of one sequence and the beginning of another are indicated. Other sequences are: Vaccinia virus (strain Copenhagen) DNA polymerase (DPOL_VACCC) (SEQ ID NO:2); Vaccinia virus (strain WR) DNA polymerase (DPOL_VACCV) (SEQ ID NO:3); Variola virus DNA polymerase (DPOL_VARV) (SEQ ID NO:4); Fowlpox virus DNA polymerase (DPOL_FOWPV) (SEQ ID NO:5); Bos taurus (Bovine) DNA polymerase delta catalytic chain (DPOD_BOVIN) (SEQ ID NO:6); Human DNA polymerase delta catalytic chain (DPOD_HUMAN) (SEQ ID NO:7); Candida albicans (Yeast) DNA polymerase delta large chain (DPOD_CANAL) (SEQ ID NO:8); Saccharomyces cerevisiae DNA polymerase delta large chain (DPOD_YEAST) (SEQ ID NO:9); Schizosaccharomyces pombe DNA polymerase delta large chain (DPOD_SCHPO) (SEQ ID NO:10); Plasmodium falciparum DNA polymerase delta catalytic chain (DPOD_PLAFK) (SEQ ID NO:11); Chlorella virus NY-2A DNA polymerase (DPOL_CHVN2) (SEQ ID NO:12); Paramecium bursaria chlorella virus 1 DNA polymerase (DPOL_CHVP1) (SEQ ID NO:13); Epstein-Barr virus (strain B95-8) DNA polymerase (DPOL_EBV) (SEQ ID NO:14); Herpesvirus saimiri (strain 11) DNA polymerase (DPOL_HSVSA) (SEQ ID NO:15); Herpes simplex virus (type 1/strain 17) DNA polymerase (DPOL_HSV11) (SEQ ID NO:16); Herpes simplex virus (type 2/strain 186) DNA polymerase (DPOL_HSV21) (SEQ ID NO:17); Equine herpesvirus type 1 (strain Ab4p) (EHV-1) DNA polymerase (DPOL_HSVEB) (SEQ ID NO:18); Varicella-zoster virus (strain Dumas) (VZV) DNA polymerase (DPOL_VZVD) (SEQ ID NO:19); Human cytomegalovirus (strain AD169) DNA polymerase (DPOL_HCMVA) (SEQ ID NO:20); Murine cytomegalovirus (strain Smith) DNA polymerase (DPOL_MCMVS) (SEQ ID NO:21); Herpes simplex virus (type 6/strain Uganda-1102) DNA polymerase (DPOL_HSV6U) (SEQ ID NO:22); Human DNA polymerase alpha catalytic subunit (DPOA_HUMAN) (SEQ ID NO:23); Mouse DNA polymerase alpha catalytic subunit (DPOA_MOUSE) (SEQ ID NO:24); Drosophila melanogaster DNA polymerase alpha catalytic subunit (DPOA_DROME) (SEQ ID NO:25); Schizosaccharomyces pombe DNA polymerase alpha catalytic subunit (DPOA_SCHPO) (SEQ ID NO:26); Saccharomyces cerevisiae DNA polymerase alpha catalytic subunit (DPOA_YEAST) (SEQ ID NO:27); Trypanosoma brucei DNA polymerase alpha catalytic subunit (DPOA_TRYBB) (SEQ ID NO:28); Autographa californica nuclear polyhedrosis virus DNA polymerase (DPOL_NPVAC) (SEQ ID NO:29); Lymantria dispar multicapsid nuclear polyhedrosis virus DNA polymerase (DPOL_NPVLD) (SEQ ID NO:30); Saccharomyces cerevisiae DNA polymerase zeta catalytic subunit (DPOZ_YEAST) (SEQ ID NO:31); Pyrococcus woesei DNA polymerase (DPOL_PYRFU) (SEQ ID NO:32); Sulfolobus solfataricus DNA polymerase I (DPO1_SULSO) (SEQ ID NO:33); Escherichia coli DNA polymerase II (DPO2_ECOLI) (SEQ ID NO:34); Desulfurococcus strain Tok DNA polymerase (Dpol_Dtok) (SEQ ID NO:35); and bacteriophage RB69 DNA polymerase (RB69) (SEQ ID NO:37). Most of the sequences are partial as found in the Protein Families Data Base of Alignments and HMMs (Sanger Institute), family DNA_pol_B, accession no. PF00136.

[0008] FIG. 4 depicts a sequence alignment of the predicted gene product of ORF739f from bacteriophage RM378 (ORF-739f) (SEQ ID NO:40), Autographa californica nucleopolyhedrovirus putative bifunctional polynucleotide kinase and RNA ligase (ACNV-RNAlig) (SEQ ID NO:38); and bacteriophage T4 RNA ligase (T4-RNAlig) (SEQ ID NO:39).

[0009] FIG. 5 depicts a sequence alignment of the predicted gene product of ORF1218a from bacteriophage RM378 (ORF-1218a) (SEQ ID NO:43) with proteins or domains with 5′-3′ exonuclease activity, including: Escherichia coli DNA polymerase I (Ecoli-polI) (SEQ ID NO:41), Thermus aquaticus DNA polymerase I (Taq-polI) (SEQ ID NO:42), bacteriophage T4 ribonuclease H (T4-RNaseH) (SEQ ID NO:44) and bacteriophage T7 gene 6 exonuclease (T7-gp6exo) (SEQ ID NO:45). Conservation of acidic residues, mainly clustered at the proposed active site, is seen.

[0010] FIGS. 6A-6B depict a sequence alignment of the predicted gene product of ORF1293b (SEQ ID NO:55) from bacteriophage RM378 (ORF1293b) with sequences of replicative DNA helicases of the DnaB family, including: Escherichia coli (DnaB-Ecoli) (SEQ ID NO:46), Haemophilus influenzae (DnaB-Hinflu) (SEQ ID NO:47), Chlamydia trachomatis (DnaB-Ctracho) (SEQ ID NO:48), Bacillus stearothermophilus (DnaB-Bstearo) (SEQ ID NO:49), Helicobacter pylori (DnaB-Hpylor) (SEQ ID NO:50), Mycoplasma genitalium (DnaB-Mgenital) (SEQ ID NO:51), Borrelia burgdorferi (DnaB-Bburgdor) (SEQ ID NO:52), bacteriophage T4 gene 41 (T4-gp41) (SEQ ID NO:53), bacteriophage T7 gene 4 (T7-gp4) (SEQ ID NO:54) (from the Protein Families Data Base of Alignments and HMMs (Sanger Institute), family DnaB, accession no. PF00772). The sequences have been truncated at the N-termini, and conserved sequence motifs are indicated.

[0011] FIGS. 7A-7B depict the nucleic acid sequence of open reading frame ORF 056e (nucleotides 21993-23042 of the genome) (SEQ ID NO:56) with flanking sequences, and the putative encoded polypeptide (SEQ ID NO:57) which displays amino acid sequence similarity to polymerase 3′-5′ exonucleases.

[0012] FIGS. 8A-8B depict the nucleic acid sequence of open reading frame ORF 632e (nucleotides 79584-81152 of the genome) (SEQ ID NO:58) with flanking sequences, and the putative encoded polypeptide (SEQ ID NO:59) which displays amino acid sequence similarity to polymerases.

[0013] FIGS. 9A-9B depict the nucleic acid sequence of open reading frame ORF 739f (nucleotides 90291-91607 of the genome) (SEQ ID NO:60) with flanking sequences, and the putative encoded polypeptide (SEQ ID NO:40) which displays amino acid sequence similarity to RNA ligase.

[0014] FIGS. 10A-10B depict the nucleic acid sequence of open reading frame ORF 1218a (nucleotides 8212-9168 of the genome) (SEQ ID NO:61) with flanking sequences, and the putative encoded polypeptide (SEQ ID NO:43) which displays amino acid sequence similarity to 5′-3′ exonuclease of DNA polymerase I and T4 RNase H.

[0015] FIGS. 11A-11B depict the nucleic acid sequence of open reading frame ORF 1293b (nucleotides 15785-17035 of the genome) (SEQ ID NO:62) with flanking sequences, and the putative encoded polypeptide (SEQ ID NO:55) which displays amino acid sequence similarity to T4 DNA helicase.

DETAILED DESCRIPTION OF THE INVENTION

[0016] The present invention relates to a bacteriophage, the nucleic acid sequence of the bacteriophage genome as well as portions of the nucleic acid sequence of the bacteriophage genome (e.g., a portion containing an open reading frame), and proteins encoded by the nucleic acid sequences, as well as nucleic acid constructs comprising portions of the nucleic acid sequence of the bacteriophage genome, and host cells comprising such nucleic acid constructs. As described herein, Applicants have isolated and characterized a novel bacteriophage active against the slightly halophilic, thermophilic eubacterium Rhodothermus marinus. The bacteriophage, RM 378, is a member of the Myoviridae family, with an A2 morphology. RM 378, which is completely stable up to about 65° C., appears to consist of approximately 16 proteins with one major protein of molecular weight of 61,000 daltons. RM 378 can be replicated in Rhodothermus marinus species ITI 378.

[0017] RHODOTHERMUS MARINUS ITI 378

[0018] Accordingly, one embodiment of the invention is the bacterium, Rhodothermus marinus species ITI 378. Rhodothermus marinus, and particularly species ITI 378, can be cultured in a suitable medium, such as medium 162 for Thermus as described by Degryse et al. (Arch. Microbiol. 117:189-196 (1978)), with 1/10 buffer and with 1% NaCl. Rhodothermus marinus species ITI 378 can be used in replication of bacteriophage RM 378, as described herein, or in replication or identification of other bacteriophages, particularly thermophilic bacteriophages. Rhodothermus marinus species ITI 378 can also be used in the study of the relationship between the bacteriophages and their host cells (e.g., between bacteriophage RM 378 and Rhodothermus marinus species ITI 378).

[0019] BACTERIOPHAGE RM 378

[0020] Another embodiment of the invention is isolated RM 378 bacteriophage. “Isolated” RM 378 bacteriophage refers to bacteriophage that has been separated, partially or totally, from its native environment (e.g., separated from Rhodothermus marinus host cells) (“native bacteriophage”), and also refers to bacteriophage that has been chemically synthesized or recombinantly produced (“recombinant bacteriophage”). A bacteriophage that has been “recombinantly produced” refers to a bacteriophage that has been manufactured using recombinant DNA technology, such as by inserting the bacteriophage genome into an appropriate host cell (e.g., by introducing the genome itself into a host cell, or by incorporating the genome into a vector, which is then introduced into the host cell).

[0021] Isolated bacteriophage RM 378 can be used in the study of the relationship between the bacteriophages and their host cells (e.g., Rhodothermus marinus, such as Rhodothermus marinus species ITI 378). Isolated bacteriophage RM 378 can also be used as a vector to deliver nucleic acids to a host cell; that is, the bacteriophage can be modified to deliver nucleic acids comprising a gene from an organism other than the bacteriophage (a “foreign” gene). For example, nucleic acids encoding a polypeptide (e.g., an enzyme or pharmaceutical peptide) can be inserted into the genome of bacteriophage RM 378, using standard techniques. The resultant modified bacteriophage can then be used to infect host cells, and the protein encoded by the foreign nucleic acids can then be produced.

[0022] Bacteriophage RM 378 can be produced by inoculating appropriate host cells with the bacteriophage. Representative host cells in which the bacteriophage can replicate include Rhodothermus marinus, particularly species isolated in a location that is geographically similar to the location where bacteriophage RM 378 was isolated (e.g., northwest Iceland). In a preferred embodiment, the host cell is Rhodothermus marinus species ITI 378. The host cells are cultured in a suitable medium (e.g., medium 162 for Thermus as described by Degryse et al., Arch. Microbiol. 117:189-196 (1978), with 1/10 buffer and with 1% NaCl). In addition, the host cells are cultured under conditions suitable for replication of the bacteriophage. For example, in a preferred embodiment, the host cells are cultured at a temperature of at least approximately 50° C. In a more preferred embodiment, the host cells are cultured at a temperature between about 50° C. and about 80° C. The bacteriophage can also be stored in a cell lysate at about 4° C.

[0023] NUCLEIC ACIDS OF THE INVENTION

[0024] Another embodiment of the invention pertains to isolated nucleic acid sequences obtainable from the genome of bacteriophage RM 378. As described herein, approximately 130 kB of the genome of bacteriophage RM 378 have been sequenced. The sequence of this 130 kB is set forth in FIG. 1. There are at least approximately 200 open reading frames (ORFs) in the sequence; of these, at least approximately 120 putatively encode a polypeptide of 100 amino acids in length or longer. These 120 are set forth in FIG. 2. FIG. 2 sets forth the locus of each ORF; the start and stop nucleotides in the sequence of each ORF; the number of nucleotides in the ORF, and the expected number of amino acids encoded therein; the direction of the ORF; the identity of the putative protein encoded therein; the protein identified by a BLAST search as being the closest match to the putative protein; the percentage identity at the amino acid level of the putative protein (based on partial sequence similarity; the overall similarity is lower); the organism from which the closest matching protein is derived; and other information relating to the ORFs.
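
As an illustration of how the start and stop nucleotides of an ORF relate to its length, the following is a minimal sketch that uses the genome coordinates quoted for FIGS. 7-11. It assumes the coordinates are 1-based and inclusive, and it reports the number of codons only; whether the last codon is a stop codon (so that the encoded polypeptide is one residue shorter) is an assumption that is not stated in the text. The function names are hypothetical.

```python
# Genome coordinates quoted in the descriptions of FIGS. 7-11.
# Assumption: positions are 1-based and inclusive.
ORF_COORDS = {
    "ORF 056e": (21993, 23042),
    "ORF 632e": (79584, 81152),
    "ORF 739f": (90291, 91607),
    "ORF 1218a": (8212, 9168),
    "ORF 1293b": (15785, 17035),
}


def orf_length_nt(start, stop):
    """Number of nucleotides spanned by an inclusive coordinate pair."""
    return abs(stop - start) + 1


def codon_count(start, stop):
    """Number of complete codons in the span; rejects partial codons."""
    length = orf_length_nt(start, stop)
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length // 3


for name, (start, stop) in ORF_COORDS.items():
    print(name, orf_length_nt(start, stop), "nt,", codon_count(start, stop), "codons")
```

Under these assumptions every listed ORF spans a whole number of codons, and each is well above the 100-amino-acid threshold mentioned above.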

[0025] The invention thus pertains to isolated nucleic acid sequence of the genome (“isolated genomic DNA”) of the bacteriophage RM 378 that has been deposited with the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ) as described below. The invention also pertains to isolated nucleic acid sequence of the genome of bacteriophage RM 378 as is shown in FIG. 1 (SEQ ID NO:1).

[0026] The invention additionally pertains to isolated nucleic acid molecules comprising the nucleotide sequences of each of the ORFs described above or fragments thereof, as well as nucleic acid molecules comprising nucleotide sequences of more than one of the ORFs described above or fragments of more than one of the ORFs. The nucleic acid molecules of the invention can be DNA, or can also be RNA, for example, mRNA. DNA molecules can be double-stranded or single-stranded; single-stranded RNA or DNA can be either the coding, or sense, strand or the non-coding, or antisense, strand. Preferably, the nucleic acid molecule comprises at least about 100 nucleotides, more preferably at least about 150 nucleotides, and even more preferably at least about 200 nucleotides. The nucleotide sequence can be only that which encodes at least a fragment of the amino acid sequence of a polypeptide; alternatively, the nucleotide sequence can include at least a fragment of a coding sequence along with additional non-coding sequences such as non-coding 3′ and 5′ sequences (including regulatory sequences, for example).

[0027] In certain preferred embodiments, the nucleotide sequence comprises one of the following ORFs: ORF 056e, 632e, 739f, 1218a, 1293b. For example, the nucleotide sequence can consist essentially of one of the ORFs and its flanking sequences, such as are shown in FIGS. 7-11 (e.g., ORF 056e (SEQ ID NO:56), 632e (SEQ ID NO:58), 739f (SEQ ID NO:60), 1218a (SEQ ID NO:61), 1293b (SEQ ID NO:62)).

[0028] Additionally, the nucleotide sequence(s) can be fused to a marker sequence, for example, a sequence which encodes a polypeptide to assist in isolation or purification of the polypeptide. Representative sequences include, but are not limited to, those which encode a glutathione-S-transferase (GST) fusion protein. In one embodiment, the nucleotide sequence contains a single ORF in its entirety (e.g., encoding a polypeptide, as described below); or contains a nucleotide sequence encoding an active derivative or active fragment of the polypeptide; or encodes a polypeptide which has substantial sequence identity to the polypeptides described herein. In a preferred embodiment, the nucleic acid encodes a polymerase (e.g., DNA polymerase); DNA polymerase accessory protein; dsDNA binding protein; deoxyribonucleotide-3-phosphatase; DNA topoisomerase; DNA helicase; an exonuclease (e.g., 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H)); RNA ligase; site-specific RNase inhibitor of protease; endonuclease; exonuclease; mobility nuclease; reverse transcriptase; single-stranded binding protein; endolysin; lysozyme; helicase; alpha-glucosyltransferase; or thymidine kinase, as described herein. In a particularly preferred embodiment, the nucleic acid encodes a DNA polymerase, 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H), DNA helicase or RNA ligase. In another particularly preferred embodiment, the nucleic acid encodes a DNA polymerase that lacks exonuclease domains, or a 3′-5′ exonuclease that lacks DNA polymerase domain, as described below.

[0029] The nucleic acid molecules of the invention are “isolated;” as used herein, an “isolated” nucleic acid molecule or nucleotide sequence is intended to mean a nucleic acid molecule or nucleotide sequence which is not flanked by nucleotide sequences which normally (in nature) flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA library). For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Thus, an isolated nucleic acid molecule or nucleotide sequence can include a nucleic acid molecule or nucleotide sequence which is synthesized chemically or by recombinant means. Therefore, recombinant DNA contained in a vector is included in the definition of “isolated” as used herein. Also, isolated nucleotide sequences include recombinant DNA molecules in heterologous organisms, as well as partially or substantially purified DNA molecules in solution. In vivo and in vitro RNA transcripts of the DNA molecules of the present invention are also encompassed by “isolated” nucleotide sequences.

[0030] The present invention also pertains to nucleotide sequences which are not necessarily found in nature but which encode the polypeptides described below. Thus, DNA molecules which comprise a sequence which is different from the naturally-occurring nucleotide sequence but which, due to the degeneracy of the genetic code, encode the polypeptides of the present invention are the subject of this invention. The invention also encompasses variations of the nucleotide sequences of the invention, such as those encoding active fragments or active derivatives of the polypeptides as described below. Such variations can be naturally-occurring, or non-naturally-occurring, such as those induced by various mutagens and mutagenic processes. Intended variations include, but are not limited to, addition, deletion and substitution of one or more nucleotides which can result in conservative or non-conservative amino acid changes, including additions and deletions. Preferably, the nucleotide or amino acid variations are silent or conserved; that is, they do not alter the characteristics or activity of the encoded polypeptide.
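
The degeneracy of the genetic code referred to above can be illustrated with a short sketch. The two DNA sequences below are hypothetical and are not taken from the RM 378 genome; the codon table is a deliberately partial excerpt of the standard genetic code, containing only the codons used in the example.

```python
# Partial codon table (standard genetic code); only codons used below.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TAA": "*",
}


def translate(dna):
    """Translate a coding-strand DNA sequence, read in frame from position 0."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical sequences that differ at three nucleotide positions...
native = "ATGGCTAAACTGTAA"
variant = "ATGGCGAAGTTATAA"

# ...yet encode the same peptide, so the substitutions are silent.
print(translate(native), translate(variant))
```

Both sequences translate to the same four-residue peptide followed by a stop, which is the sense in which a non-natural nucleotide sequence can still encode a polypeptide of the invention.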

[0031] The invention described herein also relates to fragments of the isolated nucleic acid molecules described herein. The term “fragment” is intended to encompass a portion of a nucleotide sequence described herein which is from at least about 25 contiguous nucleotides to at least about 50 contiguous nucleotides or longer in length; such fragments are useful as probes and also as primers. Particularly preferred primers and probes selectively hybridize to the nucleic acid molecule encoding the polypeptides described herein. For example, fragments which encode polypeptides that retain activity, as described below, are particularly useful.

[0032] The invention also pertains to nucleic acid molecules which hybridize under high stringency hybridization conditions, such as for selective hybridization, to a nucleotide sequence described herein (e.g., nucleic acid molecules which specifically hybridize to a nucleotide sequence encoding polypeptides described herein, and, optionally, have an activity of the polypeptide). Hybridization probes are oligonucleotides which bind in a base-specific manner to a complementary strand of nucleic acid. Suitable probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991).

[0033] Such nucleic acid molecules can be detected and/or isolated by specific hybridization (e.g., under high stringency conditions). “Stringency conditions” for hybridization is a term of art which refers to the incubation and wash conditions, e.g., conditions of temperature and buffer concentration, which permit hybridization of a particular nucleic acid to a second nucleic acid; the first nucleic acid may be perfectly (i.e., 100%) complementary to the second, or the first and second may share some degree of complementarity which is less than perfect (e.g., 60%, 75%, 85%, 95%). For example, certain high stringency conditions can be used which distinguish perfectly complementary nucleic acids from those of less complementarity.

[0034] “High stringency conditions”, “moderate stringency conditions” and “low stringency conditions” for nucleic acid hybridizations are explained on pages 2.10.1-2.10.16 and pages 6.3.1-6 in Current Protocols in Molecular Biology (Ausubel, F. M. et al., “Current Protocols in Molecular Biology”, John Wiley & Sons, (1998)) the teachings of which are hereby incorporated by reference. The exact conditions which determine the stringency of hybridization depend not only on ionic strength (e.g., 0.2×SSC, 0.1×SSC), temperature (e.g., room temperature, 42° C., 68° C.) and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS, but also on factors such as the length of the nucleic acid sequence, base composition, percent mismatch between hybridizing sequences and the frequency of occurrence of subsets of that sequence within other non-identical sequences. Thus, high, moderate or low stringency conditions can be determined empirically.

[0035] By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, conditions which will allow a given sequence to hybridize (e.g., selectively) with the most similar sequences in the sample can be determined.

[0036] Exemplary conditions are described in Krause, M. H. and S. A. Aaronson, Methods in Enzymology, 200:546-556 (1991). See also Ausubel et al., “Current Protocols in Molecular Biology”, John Wiley & Sons, (1998), which describes the determination of washing conditions for moderate or low stringency conditions. Washing is the step in which conditions are usually set so as to determine a minimum level of complementarity of the hybrids. Generally, starting from the lowest temperature at which only homologous hybridization occurs, each ° C. by which the final wash temperature is reduced (holding SSC concentration constant) allows an increase by 1% in the maximum extent of mismatching among the sequences that hybridize. Generally, doubling the concentration of SSC results in an increase in T_m of ~17° C. Using these guidelines, the washing temperature can be determined empirically for high, moderate or low stringency, depending on the level of mismatch sought.
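
The wash-temperature guideline above (one additional percent of tolerated mismatch for each ° C. by which the final wash temperature is lowered, at constant SSC concentration) amounts to simple arithmetic. The sketch below is only a starting estimate under that rule of thumb; the function name is hypothetical, and the 68° C. starting value in the usage example is an assumed input, not a measured one. The text itself notes that the temperature is ultimately determined empirically.

```python
def final_wash_temperature(homologous_temp_c, max_mismatch_percent):
    """Estimate a final wash temperature from the 1 % mismatch per deg C rule.

    homologous_temp_c: lowest temperature (deg C) at which only homologous
        hybridization occurs, at a fixed SSC concentration (assumed known).
    max_mismatch_percent: maximum extent of mismatching to be tolerated.
    """
    if max_mismatch_percent < 0:
        raise ValueError("mismatch percentage cannot be negative")
    return homologous_temp_c - 1.0 * max_mismatch_percent


# Assumed example: tolerate up to 10 % mismatch starting from 68 deg C.
print(final_wash_temperature(68.0, 10))
```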

[0037] For example, a low stringency wash can comprise washing in a solution containing 0.2×SSC/0.1% SDS for 10 min at room temperature; a moderate stringency wash can comprise washing in a prewarmed (42° C.) solution containing 0.2×SSC/0.1% SDS for 15 min at 42° C.; and a high stringency wash can comprise washing in prewarmed (68° C.) solution containing 0.1×SSC/0.1% SDS for 15 min at 68° C. Furthermore, washes can be performed repeatedly or sequentially to obtain a desired result as known in the art.

[0038] Equivalent conditions can be determined by varying one or more of the parameters given as an example, as known in the art, while maintaining a similar degree of identity or similarity between the target nucleic acid molecule and the primer or probe used. Hybridizable nucleic acid molecules are useful as probes and primers, e.g., for diagnostic applications.

[0039] Such hybridizable nucleotide sequences are useful as probes and primers for diagnostic applications. As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template, but must be sufficiently complementary to hybridize with a template. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the DNA sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the 3′ end of the sequence to be amplified.
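
A minimal sketch of deriving a primer pair from a target sequence follows. It assumes the common PCR convention in which the target is given as one strand written 5′ to 3′, the upstream primer copies the 5′ end of that strand, and the downstream primer is the reverse complement of its 3′ end. The function names, the default length of 20 nucleotides (chosen from within the 15 to 30 range mentioned above) and the target sequence in the usage example are all hypothetical.

```python
# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (upstream, downstream) primers for a target given 5' to 3'."""
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    upstream = target[:length].upper()
    downstream = reverse_complement(target[-length:])
    return upstream, downstream


# Hypothetical 40-nucleotide target, for illustration only.
target = "A" * 15 + "C" * 10 + "G" * 15
print(primer_pair(target, length=15))
```

Real primer design would also weigh melting temperature, secondary structure and specificity, none of which this sketch attempts.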

[0040] The invention also pertains to nucleotide sequences which have a substantial identity with the nucleotide sequences described herein; particularly preferred are nucleotide sequences which have at least about 10%, preferably at least about 20%, more preferably at least about 30%, more preferably at least about 40%, even more preferably at least about 50%, yet more preferably at least about 70%, still more preferably at least about 80%, and even more preferably at least about 90% identity, with nucleotide sequences described herein. Particularly preferred in this instance are nucleotide sequences encoding polypeptides having an activity of a polypeptide described herein. For example, in one embodiment, the nucleotide sequence encodes a DNA polymerase, 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H), DNA helicase, or RNA ligase, as described below. In a preferred embodiment, the nucleotide sequence encodes a DNA polymerase lacking exonuclease domains, or a 3′-5′ exonuclease lacking DNA polymerase domain, as described below.

[0041] To determine the percent identity of two nucleotide sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first nucleotide sequence). The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical positions / total # of positions) × 100).
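
The percent identity formula above can be sketched as follows for two sequences that have already been aligned. The sketch assumes that gaps are written as "-" and that the "total # of positions" is the full alignment length including gapped columns; the text does not specify how gapped columns are counted, so that choice is an assumption. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical alignment: 7 identical positions out of 9 columns.
print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 2))  # 77.78
```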

[0042] The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin et al., Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993). Such an algorithm is incorporated into the NBLAST program which can be used to identify sequences having the desired identity to nucleotide sequences of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See the programs provided by National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health. In one embodiment, parameters for sequence comparison can be set at W=12. Parameters can also be varied (e.g., W=5 or W=20). The value “W” determines how many continuous nucleotides must be identical for the program to identify two sequences as containing regions of identity.
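
The role of the word size W can be illustrated with a simplified sketch that reports every position at which two sequences share an exact run of W nucleotides. This is only an illustration of exact-word matching; it is not the NBLAST implementation, which additionally extends and scores such matches. The function name and the example sequences are hypothetical.

```python
def shared_words(query, subject, w=12):
    """Return (query_pos, subject_pos) pairs where W-nucleotide words match exactly."""
    if w < 1:
        raise ValueError("word size must be at least 1")
    # Index every word of length w in the subject by its start position.
    subject_index = {}
    for j in range(len(subject) - w + 1):
        subject_index.setdefault(subject[j:j + w], []).append(j)
    # Look up each query word in the index.
    hits = []
    for i in range(len(query) - w + 1):
        for j in subject_index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


# Hypothetical sequences sharing an exact 8-nucleotide run.
query = "GGGACGTACGTTTT"
subject = "CCACGTACGTAA"
print(shared_words(query, subject, w=8))   # one hit
print(shared_words(query, subject, w=12))  # no hit: the shared run is too short
```

A smaller W therefore reports shorter regions of identity, at the cost of more chance matches, while a larger W is more selective.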

[0043] The invention also provides expression vectors containing anucleic acid sequence encoding a polypeptide described herein (or anactive derivative or fragment thereof), operably linked to at least oneregulatory sequence. Many expression vectors are commercially available,and other suitable vectors can be readily prepared by the skilledartisan. “Operably linked” is intended to mean that the nucleotidesequence is linked to a regulatory sequence in a manner which allowsexpression of the nucleic acid sequence. Regulatory sequences areart-recognized and are selected to produce the polypeptide or activederivative or fragment thereof. Accordingly, the term “regulatorysequence” includes promoters, enhancers, and other expression controlelements which are described in Goeddel, Gene Expression Technology:Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Forexample, the native regulatory sequences or regulatory sequences nativeto bacteriophage RM 378 can be employed. It should be understood thatthe design of the expression vector may depend on such factors as thechoice of the host cell to be transformed and/or the type of polypeptidedesired to be expressed. For instance, the polypeptides of the presentinvention can be produced by ligating the cloned gene, or a portionthereof, into a vector suitable for expression in an appropriate hostcell (see, for example, Broach, et al., Experimental Manipulation ofGene Expression, ed. M. Inouye (Academic Press, 1983) p. 83; MolecularCloning: A Laboratory Manual, 2nd Ed., ed. Sambrook et al. (Cold SpringHarbor Laboratory Press, 1989) Chapters 16 and 17). Typically,expression constructs will contain one or more selectable markers,including, but not limited to, the gene that encodes dihydrofolatereductase and the genes that confer resistance to neomycin,tetracycline, ampicillin, chloramphenicol, kanamycin and streptomycinresistance. 
Thus, prokaryotic and eukaryotic host cells transformed by the described expression vectors are also provided by this invention. For instance, cells which can be transformed with the vectors of the present invention include, but are not limited to, bacterial cells such as Rhodothermus marinus, E. coli (e.g., E. coli K12 strains), Streptomyces, Pseudomonas, Bacillus, Serratia marcescens and Salmonella typhimurium. The host cells can be transformed by the described vectors by various methods (e.g., electroporation; transfection using calcium chloride, rubidium chloride, calcium phosphate, DEAE-dextran, or other substances; microprojectile bombardment; lipofection; infection where the vector is an infectious agent such as a retroviral genome; and other methods), depending on the type of cellular host. The nucleic acid molecules of the present invention can be produced, for example, by replication in such a host cell, as described above. Alternatively, the nucleic acid molecules can also be produced by chemical synthesis.

[0044] The isolated nucleic acid molecules and vectors of the invention are useful in the manufacture of the encoded polypeptide, as probes for isolating homologous sequences (e.g., from other bacteriophage species), as well as for detecting the presence of the bacteriophage in a culture of host cells.

[0045] The nucleotide sequences of the nucleic acid molecules described herein (e.g., a nucleic acid molecule comprising any of the open reading frames shown in FIG. 2, such as a nucleic acid molecule comprising the open reading frames depicted in FIGS. 7-11 (SEQ ID NO:56, 58, 60, 61 and 62, respectively)) can be amplified by methods known in the art. For example, this can be accomplished by PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, New York, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202.

[0046] Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA, 87, 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.

[0047] The amplified DNA can be radiolabelled and used as a probe for screening a library or other suitable vector to identify homologous nucleotide sequences. Corresponding clones can be isolated, DNA can be obtained following in vivo excision, and the cloned insert can be sequenced in either or both orientations by art-recognized methods, to identify the correct reading frame encoding a protein of the appropriate molecular weight. For example, the direct analysis of the nucleotide sequence of homologous nucleic acid molecules of the present invention can be accomplished using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual, (Acad. Press, 1988)). Using these or similar methods, the protein(s) and the DNA encoding the protein can be isolated, sequenced and further characterized.

[0048] POLYPEPTIDES OF THE INVENTION

[0049] The invention additionally relates to isolated polypeptides obtainable from the bacteriophage RM 378. The term, “polypeptide,” as used herein, includes proteins, enzymes, peptides, and gene products encoded by nucleic acids described herein. In one embodiment, the invention pertains to the polypeptides encoded by the ORFs as described above. In addition, as described in detail below, bacteriophage RM 378 is similar to the well-known E. coli bacteriophage T4. Thus, it is expected that bacteriophage RM 378 comprises additional polypeptides that are homologous to those found in bacteriophage T4.

[0050] For example, representative proteins expected to be encoded by genes of bacteriophage RM 378 include the following: DNA topoisomerase; exonuclease (e.g., 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H)); helicase; enzymes related to DNA or RNA synthesis (e.g., dCTPase, dUTPase, dCDPase, dUDPase, GTPase, dGTPase, ATPase, dATPase); transposase; reverse transcriptase; polymerase (e.g., DNA polymerase, RNA polymerase); DNA polymerase accessory protein; DNA packaging protein; RNA polymerase binding protein; RNA polymerase sigma factor; site-specific RNase inhibitor of protease; recombinant protein; alpha-glucosyltransferase; mobility nuclease; endonuclease (e.g., endonuclease II, endonuclease V, endonuclease VII); inhibitor of Lon protease; thymidine kinase; site-specific RNase; N-glycosidase; endolysin; lysozyme; dNMP kinase; DNA ligase; deoxyribonucleotide-3′-phosphatase; ssDNA binding protein; dsDNA binding protein; and RNA ligase.

[0051] In a particularly preferred embodiment, the polypeptide is polymerase (e.g., DNA polymerase); DNA polymerase accessory protein; dsDNA binding protein; deoxyribonucleotide-3′-phosphatase; DNA topoisomerase; RNA ligase; site-specific RNase inhibitor of protease; endonuclease; exonuclease (e.g., 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H)); mobility nuclease; reverse transcriptase; single-stranded binding protein; endolysin; lysozyme; helicase; alpha-glucosyltransferase; or thymidine kinase. In an especially preferred embodiment, the polypeptide is a DNA polymerase, a 3′-5′ exonuclease, a 5′-3′ exonuclease (RNase H), a DNA helicase, or an RNA ligase, such as those shown in FIGS. 7-11 (e.g., a DNA polymerase (SEQ ID NO:58); a 3′-5′ exonuclease (SEQ ID NO:56); a 5′-3′ exonuclease (RNase H) (SEQ ID NO:61); a DNA helicase (SEQ ID NO:62); or an RNA ligase (SEQ ID NO:60)). In a most preferred embodiment, the polypeptide is a DNA polymerase that lacks exonuclease domains, or a 3′-5′ exonuclease that lacks DNA polymerase domain, as described in the examples below. As used herein, the term, “lacking exonuclease domains,” indicates that the polypeptide does not contain an amino acid domain (e.g., a consecutive or closely spaced series of amino acids) homologous to domains where such exonuclease activity resides in other similar polymerases (such as polymerases in the same family); it does not refer to the presence of a non-functional domain homologous to domains where exonuclease activity resides. Similarly, the term, “lacking DNA polymerase domain,” indicates that the polypeptide does not contain an amino acid domain (e.g., a consecutive or closely spaced series of amino acids) homologous to domains where such DNA polymerase activity resides in other similar exonucleases (such as exonucleases in the same family); it does not refer to the presence of a non-functional domain homologous to domains where DNA polymerase activity resides.

[0052] These polypeptides can be used in a similar manner as the homologous polypeptides from bacteriophage T4; for example, polymerases and ligases of bacteriophage RM 378 can be used for amplification or manipulation of DNA and RNA sequences. The polymerases and ligases of bacteriophage RM 378, however, are expected to be much more thermostable than those of bacteriophage T4, because of the thermophilic nature of the host of bacteriophage RM 378 (in contrast with the mesophilic nature of E. coli, the host of bacteriophage T4).

[0053] The polypeptides of the invention can be partially or substantially purified (e.g., purified to homogeneity), and/or are substantially free of other polypeptides. According to the invention, the amino acid sequence of the polypeptide can be that of the naturally-occurring polypeptide or can comprise alterations therein. Polypeptides comprising alterations are referred to herein as “derivatives” of the native polypeptide. Such alterations include conservative or non-conservative amino acid substitutions, additions and deletions of one or more amino acids; however, such alterations should preserve at least one activity of the polypeptide, i.e., the altered or mutant polypeptide should be an active derivative of the naturally-occurring polypeptide. For example, the mutation(s) can preferably preserve the three dimensional configuration of the binding site of the native polypeptide, or can preferably preserve the activity of the polypeptide (e.g., if the polypeptide is a DNA polymerase, any mutations preferably preserve the ability of the enzyme to catalyze combination of nucleotide triphosphates to form a nucleic acid strand complementary to a nucleic acid template strand). The presence or absence of activity or activities of the polypeptide can be determined by various standard functional assays including, but not limited to, assays for binding activity or enzymatic activity.

[0054] Additionally included in the invention are active fragments of the polypeptides described herein, as well as fragments of the active derivatives described above. An “active fragment,” as referred to herein, is a portion of a polypeptide (or a portion of an active derivative) that retains the polypeptide's activity, as described above.

[0055] Appropriate amino acid alterations can be made on the basis of several criteria, including hydrophobicity, basic or acidic character, charge, polarity, size, the presence or absence of a functional group (e.g., -SH or a glycosylation site), and aromatic character. Assignment of various amino acids to similar groups based on the properties above will be readily apparent to the skilled artisan; further appropriate amino acid changes can also be found in Bowie et al. (Science 247:1306-1310 (1990)). For example, conservative amino acid replacements can be those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids are generally divided into four families: (1) acidic=aspartate, glutamate; (2) basic=lysine, arginine, histidine; (3) nonpolar=alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar=glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan and tyrosine are sometimes classified jointly as aromatic amino acids. For example, it is reasonable to expect that an isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine or a similar conservative replacement of an amino acid with a structurally related amino acid will not have a major effect on activity or functionality.
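By way of a minimal sketch, the four-family grouping described above can be used to flag whether a single substitution stays within one family. The names FAMILIES, family_of and is_conservative are hypothetical, one-letter residue codes are assumed, and the sketch says nothing about whether a given replacement actually preserves activity, which would have to be determined by a functional assay.

```python
# Four families of genetically encoded amino acids, in one-letter code.
FAMILIES = {
    "acidic": set("DE"),               # aspartate, glutamate
    "basic": set("KRH"),               # lysine, arginine, histidine
    "nonpolar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"), # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # Replacements named in the text: Leu->Ile, Asp->Glu, Thr->Ser.
    for a, b in [("L", "I"), ("D", "E"), ("T", "S"), ("L", "D")]:
        print(a, "->", b, is_conservative(a, b))
```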

[0056] The polypeptides of the invention can also be fusion polypeptides comprising all or a portion (e.g., an active fragment) of the native bacteriophage RM 378 polypeptide amino acid sequence fused to an additional component, with optional linker sequences. Additional components, such as radioisotopes and antigenic tags, can be selected to assist in the isolation or purification of the polypeptide or to extend the half-life of the polypeptide; for example, a hexahistidine tag would permit ready purification by nickel chromatography. The fusion protein can contain, e.g., a glutathione-S-transferase (GST), thioredoxin (TRX) or maltose binding protein (MBP) component to facilitate purification; kits for expression and purification of such fusion proteins are commercially available. The polypeptides of the invention can also be tagged with an epitope and subsequently purified using antibody specific to the epitope using art-recognized methods. Additionally, all or a portion of the polypeptide can be fused to carrier molecules, such as immunoglobulins, for many purposes, including increasing the valency of protein binding sites. For example, the polypeptide or a portion thereof can be linked to the Fc portion of an immunoglobulin; for example, such a fusion could be to the Fc portion of an IgG molecule to create a bivalent form of the protein.

[0057] Also included in the invention are polypeptides which are at least about 90% identical (i.e., polypeptides which have substantial sequence identity) to the polypeptides described herein. However, polypeptides exhibiting lower levels of identity are also useful, particularly if they exhibit high, e.g., at least about 90%, identity over one or more particular domains of the polypeptide. For example, polypeptides sharing high degrees of identity over domains necessary for particular activities, such as binding or enzymatic activity, are included herein. Thus, polypeptides which have at least about 10%, preferably at least about 20%, more preferably at least about 30%, more preferably at least about 40%, even more preferably at least about 50%, yet more preferably at least about 70%, still more preferably at least about 80%, and even more preferably at least about 90% identity, are encompassed by the invention.

[0058] Polypeptides described herein can be isolated from naturally-occurring sources (e.g., isolated from host cells infected with bacteriophage RM 378). Alternatively, the polypeptides can be chemically synthesized or recombinantly produced. For example, PCR primers can be designed to amplify the ORFs from the start codon to stop codon, using DNA of RM 378 or related bacteriophages or respective recombinant clones as a template. The primers can contain suitable restriction sites for an efficient cloning into a suitable expression vector. The PCR product can be digested with the appropriate restriction enzyme and ligated between the corresponding restriction sites in the vector (the same restriction sites, or restriction sites producing the same cohesive ends or blunt end restriction sites).
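The primer design outlined above can be sketched as follows, under stated assumptions: the ORF is a short made-up sequence (not an RM 378 ORF), the function name design_cloning_primers is hypothetical, the 18-nucleotide annealing length and the 4-nucleotide 5′ clamp are arbitrary illustrative choices, and the restriction sites are assumed to be palindromic so that the same recognition sequence can be written into either primer. EcoRI (GAATTC) and BamHI (GGATCC) are used only as familiar examples.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_cloning_primers(orf: str, fwd_site: str, rev_site: str,
                           anneal_len: int = 18, clamp: str = "GCGC"):
    """Build forward/reverse primers spanning an ORF from start to stop codon.

    Each primer is: 5' clamp + restriction site + annealing region.
    Raises ValueError if a chosen site also occurs inside the ORF, since
    digestion of the PCR product would then cut the insert itself.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    for site in (fwd_site, rev_site):
        if site in orf:
            raise ValueError(f"site {site} occurs inside the ORF")
    forward = clamp + fwd_site + orf[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    toy_orf = "ATGGCTAAACGTCTGGAAACCTTCGGTAAATAA"  # hypothetical ORF
    fwd, rev = design_cloning_primers(toy_orf, "GAATTC", "GGATCC")
    print(fwd)
    print(rev)
```

Checking for internal sites before choosing the enzymes is the design point of the sketch; in practice melting temperature and primer secondary structure would also need to be considered.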

[0059] Polypeptides of the present invention can be used as a molecular weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns using art-recognized methods. They are particularly useful for molecular weight markers for analysis of proteins from thermophilic organisms, as they will behave similarly (e.g., they will not denature as proteins from mesophilic organisms would).

[0060] The polypeptides of the present invention can be isolated or purified (e.g., to homogeneity) from cell culture (e.g., from culture of host cells infected with bacteriophage RM 378) by a variety of processes. These include, but are not limited to, anion or cation exchange chromatography, ethanol precipitation, affinity chromatography and high performance liquid chromatography (HPLC). The particular method used will depend upon the properties of the polypeptide; appropriate methods will be readily apparent to those skilled in the art. For example, with respect to protein or polypeptide identification, bands identified by gel analysis can be isolated and purified by HPLC, and the resulting purified protein can be sequenced. Alternatively, the purified protein can be enzymatically digested by methods known in the art to produce polypeptide fragments which can be sequenced. The sequencing can be performed, for example, by the methods of Wilm et al. (Nature 379(6564):466-469 (1996)). The protein may be isolated by conventional means of protein biochemistry and purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell component contaminants, as described in Jakoby, Methods in Enzymology Volume 104, Academic Press, New York (1984); Scopes, Protein Purification, Principles and Practice, 2nd Edition, Springer-Verlag, N.Y. (1987); and Deutscher (ed), Guide to Protein Purification, Methods in Enzymology, Vol. 182 (1990).

[0061] The following Examples are offered for the purpose of illustrating the present invention and are not to be construed to limit the scope of this invention. The teachings of all references cited are hereby incorporated herein by reference in their entirety.

EXAMPLE 1 Isolation, Purification and Characterization of Bacteriophage

A. Materials and Methods

[0062] Bacterial Strains and Growth Media

[0063] The thermophilic, slightly halophilic eubacterium, Rhodothermus marinus, was first isolated from shallow water submarine hot springs in Isafjardardjup in northwest Iceland (Alfredsson, G. A. et al., J. Gen. Microbiol. 134:299-306 (1988)). Since then Rhodothermus has also been isolated from two other areas in Iceland (Petursdottir et al., in prep.), and from the Azores and the Bay of Naples in Italy (Nunes, O. C. et al., Syst. Appl. Microbiol. 15:92-97 (1992); Moreira, L. et al., Syst. Appl. Microbiol. 19:83-90 (1996)). Rhodothermus is distantly related to the group containing Flexibacter, Bacteroides and Cytophaga species (Anderson, O. S. and Fridjonsson, O. H., J. Bacteriol. 176:6165-6169 (1994)).

[0064] Strain ITI 378 (originally R-21) is one of the first Rhodothermus strains isolated from submarine hot springs in Isafjardardjup in northwest Iceland. The strain was grown at 65° C. in medium 162 for Thermus (Degryse et al., Arch. Microbiol. 117:189-196 (1978)), with 1/10 the buffer and with 1% NaCl. Strain ITI 378 is phenotypically and phylogenetically similar (over 99% similarity in 16S rRNA sequence) to type strain DSM 4252.

[0065] Bacteriophage Isolation

[0066] A water sample with some sand and mud was collected from a hot spring (62° C.) appearing at low tide in Isafjardardjup, at the same site from which the bacterium was originally isolated. Samples of the same kind were collected from the Blue Lagoon and the Salt factory on Reykjanes in southwest Iceland.

[0067] After mixing a sample in a Waring blender, the sample was filtered through a Buchner funnel, followed by centrifugation, before filtering the water through a 0.45 μm membrane. After centrifuging again, the sample was filtered through a sterile 0.2 μm membrane. This filtrate was used for infecting 18 different Rhodothermus strains (8 from Isafjardardjup in northwest Iceland, and 10 from Reykjanes in southwest Iceland). The sample (4 ml) was mixed with 5 ml of soft agar A (the above growth medium with 2% agar) and 1 ml of overnight culture of different Rhodothermus strains. After pouring the sample onto a thin layer agar plate, the plates were incubated for 1-2 days at 65° C. A single, well-isolated plaque was stabbed with a sterile Pasteur pipette and dissolved in 100 μl of 10 mM MgCl₂ solution (forming the plaque solution).

[0068] The bacteriophage is sensitive to freezing; it can be stored in a cell lysate at 4° C. (e.g., as described below under “Liquid Lysate”).

[0069] Plate Lysate

[0070] Overnight culture (0.9 ml) was mixed with 100 μl of the plaque solution and incubated for 15 minutes at 65° C. before adding 3 ml of soft agar B (same as A, but 1% agar and 10 mM MgCl₂). After mixing and pouring onto thin layer agar plates, the plates were incubated for 1-2 days at 65° C. To nearly totally lysed plates was added 1 ml of 10 mM MgCl₂, and after incubating at 4° C. for a few hours, the top layer was scraped off and put into a sterile tube. After adding 100 μl chloroform and mixing it, the sample was centrifuged and the supernatant collected. The sample was centrifuged again and filtered through a 0.2 μm filter; the filtrate was stored at 4° C. This lysate was used for testing host specificity.

[0071] Liquid Lysate

[0072] Liquid cultures were infected when they had reached an absorbance of 0.5 at 600 nm (expected to contain 2.5×10⁸ cells/ml). The phage ratio was 0.1 pfu per cell in the culture. The cultures were incubated at high shaking (300 rpm) and growth was followed by measuring absorbance at 600 nm. When lysis had occurred, chloroform was added to the cultures (10 μl/ml) and shaking continued for 1 hour. Cell debris was removed by centrifugation and titer estimation was performed on the supernatant. Large-scale purification from 300 ml culture was undertaken for DNA isolation and for protein composition analysis, as well as for electron microscopy.
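As a worked illustration of the infection conditions above, the following sketch estimates the volume of phage stock needed to infect a culture at a given ratio of pfu to cells. The cell density (2.5×10⁸ cells/ml), the ratio (0.1) and the 300 ml culture volume are taken from the text; the stock titer of 10¹¹ pfu/ml and the function name phage_inoculum_ml are assumptions made only for the example.

```python
def phage_inoculum_ml(culture_ml: float, cells_per_ml: float,
                      pfu_per_cell: float, stock_pfu_per_ml: float) -> float:
    """Volume of phage stock (ml) giving the requested pfu-to-cell ratio."""
    total_cells = culture_ml * cells_per_ml   # cells in the whole culture
    pfu_needed = total_cells * pfu_per_cell   # plaque-forming units required
    return pfu_needed / stock_pfu_per_ml


if __name__ == "__main__":
    # 300 ml culture at 2.5e8 cells/ml, 0.1 pfu per cell,
    # with a hypothetical stock of 1e11 pfu/ml.
    volume = phage_inoculum_ml(300, 2.5e8, 0.1, 1e11)
    print(f"{volume * 1000:.0f} microliters of stock")
```

With these figures, 7.5×10⁹ pfu are required, which corresponds to about 75 μl of the assumed stock.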

[0073] Bacteriophage Purification

[0074] For electron microscopy, the bacteriophages were precipitated using PEG 8000 (Sambrook, J. et al., Molecular Cloning, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) and resuspended in SM buffer (Sambrook, J. et al., Molecular Cloning, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) before loading on the top of CsCl (0.75 g/ml). This sample was centrifuged for 23 hours at 38,000 rpm in a TY-64 rotor (Sorvall Ultracentrifuge). The layer of bacteriophage was collected using a syringe.

[0075] Protein Determination and DNA Isolation

[0076] Purified bacteriophage supernatant with a titer of approximately 10¹³ pfu/ml was boiled for 5 minutes in SDS and β-mercaptoethanol loading buffer and analyzed according to the method of Laemmli (Laemmli, U. K., Nature 227:680-685 (1970)) using a 10% polyacrylamide gel, which was stained with Coomassie brilliant blue. Bio-Rad pre-stained low molecular weight standards (7.7-204 kDa) were used as size markers. Bacteriophage DNA was isolated from a purified phage lysate containing approximately 10¹³ pfu/ml using the Qiagen lambda kit (Catalog No. 12543, Qiagen) according to manufacturer's instructions.

[0077] Temperature and Chloroform Sensitivity

[0078] Bacteriophage RM 378 at approximately 10¹¹ pfu/ml was incubated for 30 minutes over a temperature range of 50-96° C. before the remaining bacteriophage titer was determined. The bacteriophage lysate at approximately 10¹¹ pfu/ml was mixed with an equal volume of chloroform, and incubated at room temperature. After 30 minutes, the remaining viable bacteriophage were titrated with strain ITI 378 as a host.

[0079] Determination of G+C Content

[0080] The mole percent guanine plus cytosine content of the bacteriophage was determined by CSM with HPLC according to Mesbah (Mesbah, M. U. et al., Int. J. Syst. Bacteriol. 39:159-167 (1989)).
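For comparison with an HPLC-based measurement, the mole percent G+C can also be computed directly from a nucleotide sequence once one is available. The following is a minimal sketch of that calculation, not the HPLC method cited above; the function name gc_mol_percent is hypothetical, and characters other than A, C, G and T (for example, N) are simply ignored.

```python
def gc_mol_percent(seq: str) -> float:
    """Mole percent G+C over the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    print(gc_mol_percent("ATGCATGC"))
```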

[0081] Estimation of Genome Size

[0082] Bacteriophage DNA was digested individually with a variety of restriction endonucleases, and the fragments separated by electrophoresis on 0.5-0.8% (w/v) agarose gel. Pulsed-field gel electrophoresis (PFGE) was also used for size estimation. Pulsed Field Certified Agarose from Bio-Rad (Catalog No. 162-0137, Bio-Rad) (1%) was used for the gel, and low-melt agarose (Catalog No. 162-0017, Bio-Rad) (1%) for filling the wells when using marker plugs. Samples of 1.0 and 0.5 μg DNA were used, and Bio-Rad low range marker (#350) as well as λ-ladder (Catalog No. 170-3635, Bio-Rad) were employed. The running buffer was 0.5×TBE (Sambrook, J. et al., Molecular Cloning, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989). The Bio-Rad Pulsed Field Electrophoresis system (CHEF-DRIII) was used with an initial switch time of 60 seconds, a final switch time of 60 seconds, 6 V/cm, an angle of 120° and a 21 hour run time. Gels were stained with ethidium bromide and washed in distilled water for 3 hours before photographing under a UV light illuminator.

[0083] Electron Microscopy

[0084] The bacteriophage was stained with 2.5% phosphotungstic acid and the grids examined with a Philips EM 300 electron microscope. Bacteriophage samples from CsCl purification, as well as directly from a liquid lysed culture with titer of 10¹³ pfu/ml, were used for microscopy studies.

[0085] DNA Sequencing and Genome Analysis

[0086] The phage genome was sequenced using the “shotgun sequencing” technique (see, e.g., Fleischmann, R. D. et al., Science 269:496-512 (1995)). The sequences were aligned (Ewing, B., et al., Genome Research 8:175-185 (1998); Ewing, B. and Green, P., Genome Research 8:186-194 (1998)). The consensus sequence of 130,480 bp was visualized with the program XBB-Tools (Sicheritz-Ponten, T., Department of Molecular Evolution, Uppsala, Sweden) for open reading frames (ORFs).
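The general idea of scanning a consensus sequence for open reading frames can be sketched as below. This is not the XBB-Tools program and makes no claim about how that program works; it is a simple six-frame scan, with the hypothetical function name find_orfs, that assumes ATG as the only start codon and reports coordinates on the strand being scanned (0-based, end-exclusive).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """List ORFs (ATG through the first in-frame stop) in all six frames.

    Each result is (strand, frame, start, end, orf_sequence); min_codons
    counts the start and stop codons as part of the ORF length.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3                    # include the stop codon
                    if (end - start) // 3 >= min_codons:
                        results.append((strand, frame, start, end, s[start:end]))
                    start = None                   # look for the next ORF
    return results


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"   # hypothetical sequence
    for orf in find_orfs(toy, min_codons=5):
        print(orf)
```

A real genome annotation would also consider alternative start codons, ribosome binding sites and overlapping genes, none of which this sketch attempts.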

B. Results

[0087] Bacteriophage Isolation

[0088] The phage sample from the southwest area of Iceland, prepared as described above, infected 4 strains of Rhodothermus, all from Reykjanes in southwest Iceland. The phage sample from the northwest area of Iceland, prepared as described above, infected 7 strains of Rhodothermus, all from Isafjardardjup in northwest Iceland. Bacteriophages were isolated from two of the strains infected with the sample from the southwest, and from all 7 of the strains infected with the sample from the northwest. Of these, one of the bacteriophages from the sample from the northwest was isolated from strain ITI 378 and designated RM 378. The titer of this bacteriophage was estimated; in liquid culture it repeatedly gave titers of 5-8×10¹³ pfu/ml.

[0089] Attempts to isolate the bacteriophages from Rhodothermus by subjecting it to stress such as ultraviolet (UV) exposure did not succeed. Because such stress would be expected to excise a prophage from the chromosome and initiate a lytic response, the failed attempts suggest that Rhodothermus did not contain prophages.

[0090] Bacteriophage Morphology

[0091] Bacteriophage RM 378 is a tailed phage with a moderately elongated head. It is a T4-like phage, resembling the T4 phage of Escherichia coli both in morphology and genome size, and has a double-stranded DNA genome. RM 378 belongs to the Myoviridae family and has the A2 morphology (Ackermann, H. W., Arch. Virol. 124:201-209 (1992)). The bacteriophage head measures 85 nm on one side and 95 nm on the other. The tail is 150 nm in length, with a clear right-handed spiral to the tail sheath. The head/tail ratio is 0.63 and the total length is 245 nm.

[0092] Host Specificity and Infection

[0093] RM 378 concentrated bacteriophage was tested against 9 different Rhodothermus strains from the two different areas (Isafjardardjup in northwest Iceland, and Reykjanes in southwest Iceland). It infected 5 strains from the northwest, but no strains from the southwest. Thus, the bacteriophage infected only strains of Rhodothermus from the same geographical area from which the bacteriophage was isolated. It did not infect any of the 6 Thermus strains that were tested.

[0094] Growth of bacteria was followed at 65° C. in liquid culture. An uninfected culture was used as a control, and growth was followed until the control culture had reached stationary phase. Cell lysis started 9 hours after infection of the culture, and stationary phase in the control was reached about 14 hours after infection.

[0095] Stability of the Bacteriophage

[0096] Bacteriophage RM 378 was stable to 30 minutes exposure to chloroform, indicating that it probably does not contain lipids. Heat stability of the phage was tested at 50° C.-96° C. by incubating the phage concentrate for 30 minutes, followed by estimation of titer. There was no change of the titer up to 65° C., but at 70° C. and 80° C. a 100-fold drop in pfu/ml was measured. Linear decrease of the titer was observed up to 96° C., where it was 10,000 times lower after 30 minutes than in the starting solution. After 3 months of storage at 4° C. the titer dropped 100-fold (down to 10¹¹ pfu/ml). After 27 months of storage the titer had fallen from 10¹¹ pfu/ml to 10⁵ pfu/ml in a CsCl-purified sample.
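The titer changes reported above are often easier to compare when expressed as log₁₀ reductions. The short sketch below does that conversion for the fold changes stated in the text; the function name log10_reduction is hypothetical, and the 10¹³ pfu/ml starting titer used for the 3-month figure is inferred from the stated 100-fold drop to 10¹¹ pfu/ml.

```python
import math


def log10_reduction(initial_titer: float, remaining_titer: float) -> float:
    """Log10 of the fold drop in titer (2.0 corresponds to a 100-fold drop)."""
    if initial_titer <= 0 or remaining_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(initial_titer / remaining_titer)


if __name__ == "__main__":
    # Titers in pfu/ml, taken or inferred from the stability results.
    print(log10_reduction(1e13, 1e11))  # 3 months at 4 C: 100-fold drop
    print(log10_reduction(1e11, 1e5))   # 27 months, CsCl-purified sample
    print(log10_reduction(1e4, 1.0))    # any 10,000-fold drop (96 C, 30 min)
```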

[0097] Composition of Bacteriophage RM 378

[0098] Purified bacteriophage was subjected to SDS-PAGE analysis for examination of its protein composition. The phage was composed of at least 16 proteins with apparent molecular weights from 23-150 kDa. The five main bands were at 92, 61, 52, 50 and 26 kDa, and were in a ratio of 0.14:0.45:0.21:0.13:0.06. The major protein band of 61 kDa accounted for about 20% of the total protein; the five main bands together represented about 50% of total proteins.
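The figures in the preceding paragraph can be cross-checked with a short calculation: if the five main bands together make up about 50% of total protein and are present in the stated ratio, the share of each band in the total follows directly. The sketch below uses only the numbers given in the text; the names BAND_RATIOS, MAIN_BANDS_SHARE and share_of_total are hypothetical, and the ratio is normalized because the stated values sum to 0.99 rather than 1.

```python
# Apparent molecular weight (kDa) -> relative amount among the five main bands.
BAND_RATIOS = {92: 0.14, 61: 0.45, 52: 0.21, 50: 0.13, 26: 0.06}
MAIN_BANDS_SHARE = 0.50  # five main bands as a fraction of total protein


def share_of_total(ratios: dict, main_share: float) -> dict:
    """Fraction of total protein attributable to each main band."""
    total = sum(ratios.values())  # normalize in case the ratios do not sum to 1
    return {kda: main_share * r / total for kda, r in ratios.items()}


if __name__ == "__main__":
    for kda, share in share_of_total(BAND_RATIOS, MAIN_BANDS_SHARE).items():
        print(f"{kda} kDa: {100 * share:.1f}% of total protein")
```

On these assumptions the 61 kDa band comes out at a little under 23% of total protein, which is in line with the statement that it accounts for about 20%.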

[0099] The average G+C mol % of the RM 378 phage was 42.0±0.1. The DNA was digested with a variety of restriction enzymes (HindIII, XhoI, ClaI, AluI, NotI, SacI, PstI, BamHI, SmaI, SpeI, EcoRV). Three of the enzymes (NotI, SmaI, SpeI) did not cleave RM 378, and the rest resulted in multiple fragments. Because the addition of the fragment sizes resulted in a variable amount for the total genome size, the phage DNA was also run on PFGE, which estimated the size of the DNA to be about 150 kb.

[0100] Characteristics of the Bacteriophage

[0101] The RM 378 bacteriophage is a virulent bacteriophage following a lytic cycle of infection. Very high titer lysates of up to 10¹³ pfu/ml could be obtained, which indicated a large burst size of more than 100. Because no bacteriophages have been reported against this bacterial genus, RM 378 represents a new species.

[0102] Genome Analysis and Comparison to T4 Bacteriophage

[0103] The nucleic acid sequence of RM 378 is set forth in FIG. 1. The nucleic acid sequence of RM 378 contains at least 200 open reading frames (ORFs); see, for example, the ORFs described in FIG. 2. Of these, five were identified in more detail, as described in Example 2, including the ORFs expected to encode DNA polymerase, 3′-5′ exonuclease, 5′-3′ exonuclease, RNA ligase and DNA helicase.

[0104] RM 378 belongs in the T-even family, in that it is similar to bacteriophage T4 of Escherichia coli. Bacteriophage T4 of E. coli is a well-studied phage which, together with T2 and T6, belongs to the family of bacteriophages known as T-even phages. T-even phages are nearly identical not only in structure and composition, but also in properties. Several enzymes isolated from bacteriophage T4 are used in the field of recombinant DNA technology as well as in other commercial applications. For example, T4 DNA polymerase, T4 DNA ligase and T4 RNA ligase are frequently used in the research industry today.

[0105] The genome of RM 378 was aligned in a consensus sequence, and the open reading frames (ORFs) were analyzed and compared to the T4 bacteriophage genome. The overall genome arrangement seemed to be different and the overall similarity to known proteins was low. However, despite this apparently high genetic divergence, several structural and morphological features were highly conserved. Furthermore, homologs to proteins in T4 were identified in the RM 378 bacteriophage. These similarities are set forth in Table 1, below.

[0106] In view of the similarities between bacteriophage T4 and bacteriophage RM 378, it is reasonable to expect that bacteriophage RM 378 comprises genes that are homologous to those found in bacteriophage T4, and that these genes in bacteriophage RM 378 encode proteins and enzymes that correlate to those proteins and enzymes found in bacteriophage T4.

EXAMPLE 2 Detailed Analysis of Five Open Reading Frames (ORFs)

A. Selection of Reading Frames for Analysis

[0107] Five open reading frames (ORFs) of the numerous ORFs described above in the genome of bacteriophage RM 378 have been further characterized and the corresponding genes cloned and expressed. The genes include a DNA polymerase, 3′-5′ exonuclease, 5′-3′ exonuclease (RNase H), replicative DNA helicase and RNA ligase. These genes were chosen as examples of the many valuable genes encoded by the bacteriophage genome. The corresponding polypeptide products of these genes are mainly components of the bacteriophage replication machinery and can be utilized in various molecular biology applications, as is evident from the current use of homologous counterparts from other sources. The sequences of the five ORFs show low similarity to sequences in public databases, indicative of a distant relationship to known proteins; however, probable homology to known sequences can be established by comparison with families of sequences showing overall sequence similarity as well as conservation of shorter regions, sequence motifs and functionally important residues, in some cases aided by three-dimensional structural information. The limited sequence similarity of these sequences to publicly available sequences suggests that these gene products have functional properties very different from corresponding proteins currently in use in molecular biology applications. Together with the presumed thermostability, the properties of these gene products render them valuable in various applications in molecular biology.

[0108] DNA Polymerase

[0109] DNA polymerases have evolved to accommodate the varied tasks required for replication and repair. DNA replication involves 1) local melting of the DNA duplex at an origin of replication, 2) synthesis of a primer and Okazaki fragment, 3) DNA melting and unwinding at the replication fork, 4) extension of the primer on the leading strand and discontinuous synthesis of primers followed by extension of the lagging strand, 5) removal of RNA primers and 6) sealing of nicks (Perler et al., Adv Protein Chem 48:377-435 (1996)).

[0110] The different types of DNA polymerases have been grouped into Families A, B, C and X, corresponding to similarity with E. coli pol I, II and III, and pol β, respectively (Braithwaite, D. K. and Ito, J., Nucleic Acids Res. 21:787-802 (1993)). Each of these Families contains conserved sequence regions (Perler et al., Adv Protein Chem. 48:377-435 (1996); Blanco L., et al., Gene 100:27-38 (1991); Morrison A. et al., Proc Natl Acad Sci USA. 88:9473-9477 (1991)). Family B DNA polymerases are also called Pol α Family DNA polymerases.

[0111] The DNA polymerases of family B type include bacteriophage T4 and bacteriophage RB69 DNA polymerase as well as archaeal polymerases and E. coli polymerase II. Polymerases of this type normally have two activities, the polymerase activity and the proofreading 3′-5′ exonuclease activity, found in different domains within the same polypeptide, with the exonuclease domain being N-terminal to the polymerase domain (Steitz, T. A., J Biol Chem 274:17395-8 (1999); Kornberg, A. and Baker, T. A., DNA Replication, Freeman, N.Y. (1992); Brautigam, C. A. and Steitz, T. A., Curr. Opin. Struct. Biol. 8:45-63 (1998)). Polymerases of family B have an overall domain architecture different from polymerases of family A and do not have the 5′-3′ exonuclease activity which is normally found in polymerases in family A. The determined structure of RB69 DNA polymerase is a representative structure of a family B type polymerase and clearly shows the modular organization of the enzyme with separate domains (Wang, J. et al., Cell 89:1087-99 (1997), Protein Data Bank (PDB) accession code 1WAJ). The structure of the archaeal DNA polymerase from Desulfurococcus strain Tok was shown to have the same overall structure (Zhao, Y. et al., Structure Fold Des 7:1189-99 (1999), PDB accession code 1QQC). The alignment of polymerases in this family indicates the presence of several conserved regions in the sequences, with characteristic sequence motifs belonging to both the exonuclease domain and the polymerase domain (Hopfner, K. P. et al., Proc Natl Acad Sci USA 96:3600-3605 (1999)).

[0112] Exonucleases

[0113] Besides the basic polymerization function, DNA polymerases may contain a 5′-3′ and a 3′-5′ exonuclease activity. The 3′-5′ exonuclease activity is required for proofreading. In general the family B polymerases have 3′-5′ exonuclease activity, but not 5′-3′ exonuclease activity. If both exonucleases are present, the 5′-3′ exonuclease domain is at the N-terminus, followed by the 3′-5′ exonuclease domain and the C-terminal polymerase domain. The structure of the polymerases can be defined further in terms of domain structure. The polymerase domain is thus composed of a number of smaller domains, often referred to as the palm, fingers and thumb, and although these parts are not homologous across families, they do show analogous structural features (Steitz, T. A., J Biol Chem 274:17395-8 (1999); Kornberg, A. & Baker, T. A., DNA Replication, Freeman, N.Y. (1992); Brautigam, C. A. & Steitz, T. A., Curr. Opin. Struct. Biol. 8:45-63 (1998)).

[0114] RNase H (Ribonuclease H), e.g. from bacteriophage T4, removes the RNA primers that initiate lagging strand fragments during DNA replication of duplex DNA. The enzyme has a 5′-3′ exonuclease activity on double-stranded DNA and RNA-DNA duplexes. Further, T4 RNase H has a flap endonuclease activity that cuts preferentially on either side of the junction between single and double-stranded DNA in flap and fork DNA structures. Besides replication, T4 RNase H also plays a role in DNA repair and recombination (Bhagwat, M., et al., J. Biol. Chem. 272:28531-28538 (1997); Bhagwat, M., et al. J. Biol. Chem. 272:28523-28530 (1997)).

[0115] T4 RNase H shows sequence similarity to other enzymes with a demonstrated role in removing RNA primers, including phage T7 gene 6 exonuclease, the 5′-3′ nuclease domain of E. coli DNA polymerase I, and human FEN-1 (flap endonuclease). These enzymes have 5′-3′ exonuclease activity on both RNA-DNA and DNA-DNA duplexes and most of them have a flap endonuclease activity that removes the 5′-ssDNA tail of flap or fork structures. The T4 enzyme is homologous to members of the RAD2 family of prokaryotic and eukaryotic replication and repair nucleases (Mueser T. C., et al., Cell. 85:1101-1112 (1996)).

[0116] RNase H is a part of the reverse transcriptase complex of various retroviruses. The HIV-1 RT-associated ribonuclease H displays both endonuclease and 3′-5′ exonuclease activity (Ben-Artzi, H., et al., Nucleic Acids Res. 20:5115-5118 (1992); Schatz, O., et al., EMBO J. 4:1171-1176 (1990)).

[0117] In molecular biology, RNase H is applied to the replacement synthesis of the second strand of cDNA. The enzyme produces nicks and gaps in the mRNA strand of the cDNA:mRNA hybrid, creating a series of RNA primers that are used by the corresponding DNA polymerase during the synthesis of the second strand of cDNA (Sambrook, J., et al., Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press (1989)). The RNase H of E. coli can promote the formation and cleavage of an RNA-DNA hybrid between an RNA site and a base-paired strand of a stable hairpin or duplex DNA at temperatures below their Tm (Li, J., and R. M. Wartell, Biochemistry 37:5154-5161 (1998); Shibahara, S., et al., Nucleic Acids Res. 15:4403-4415 (1987)). Thus, the enzyme has been used for site-directed cleavage of RNA using chimeric DNA splints (presence of complementary chimeric oligonucleotides) (Inoue, H., et al., Nucleic Acids Symp Ser. 19:135-138 (1988)) or an oligoribonucleotide capable of forming a stem and loop structure (Hosaka H., et al., J. Biol. Chem. 269:20090-20094 (1994)).

[0118] DNA Helicase

[0119] DNA helicases use energy derived from hydrolysis of nucleoside triphosphate to catalyze the disruption of the hydrogen bonds that hold the two strands of double-stranded DNA together. The reaction results in the formation of the single-stranded DNA required as a template or reaction intermediate in DNA replication, repair or recombination (Matson, S. W., et al., BioEssays. 16:13-21 (1993)).

[0120] The bacteriophage T4 gp41 is a highly processive replicative helicase (similar to the DnaB protein of E. coli) and has been shown to form a hexamer in the presence of ATP (Dong, F., and P. H. von Hippel, J. Biol. Chem. 271:19625-19631 (1996)). The enzyme facilitates the unwinding of the DNA helix ahead of the advancing DNA polymerase and accelerates the movement of the replication fork. It has been suggested that gp41 interacts with the polymerase holoenzyme at the replication fork (Schrock R. D. and B. Alberts, J. Biol. Chem. 271:16678-16682 (1996)). Gp41 has a 5′-3′ polarity and requires a single-stranded region on the 5′ side of the duplex to be unwound. The ATP-activated helicase binds to a single gp61 primase molecule on an appropriate DNA template (Morris, P. D., and K. D. Raney, Biochemistry. 38:5164-5171 (1999)) to reconstitute a stable primosome (Richardson, R. W. and N. G. Nossal, J. Biol. Chem. 264:4725-4731 (1989)). Although gp41 alone does not form a stable complex with a DNA template, this helicase by itself can carry out moderately processive ATP-driven translocation along single-stranded DNA (Dong, F., and P. H. von Hippel. J. Biol. Chem. 271:19625-19631 (1996)). The T4 gene 59 protein accelerates the loading of gp41 onto DNA when it is covered with 32 protein (the T4 single strand binding protein), and stimulates the helicase activity to catalyze replication fork movement through a DNA double helix, even through a promoter-bound RNA polymerase molecule (Barry, J., and B. Alberts. J. Biol. Chem. 269:33063-33068 (1994); Tarumi, K., and T. Yonesaki, J Biol Chem. 270:2614-2619 (1995)). The T4 gp41 helicase has also been disclosed to participate in DNA recombination. Following exonuclease nicking of dsDNA and further expansion into a gap, gp41 creates a free 3′ end, which is required as a substrate by recombination proteins (RecA-like) (Tarumi, K., and T. Yonesaki. J Biol Chem. 270:2614-2619 (1995)).

[0121] RNA Ligase

[0122] RNA ligase is abundant in T4-infected cells and has been purified in high yields. Bacteriophage T4 RNA ligase catalyzes the ATP-dependent ligation of a 5′-phosphoryl-terminated nucleic acid donor (i.e. RNA or DNA) to a 3′-hydroxyl-terminated nucleic acid acceptor. The reaction can be either intramolecular or intermolecular, i.e., the enzyme catalyzes the formation of circular DNA/RNA, linear DNA/RNA dimers, and RNA-DNA or DNA-RNA block co-polymers. The use of a 5′-phosphate, 3′-hydroxyl terminated acceptor and a 5′-phosphate, 3′-phosphate terminated donor limits the reaction to a unique product. Thus, the enzyme can be an important tool in the synthesis of DNA of defined sequence (Marie I., et al., Biochemistry 19:635-642 (1980); Sugino, A. et al., J. Biol. Chem. 252:1732-1738 (1977)).

[0123] The practical use of T4 RNA ligase has been demonstrated in many ways. Various ligation-anchored PCR amplification methods have been developed, where an anchor of defined sequence is directly ligated to single-stranded DNA (following primer extension, e.g. first strand cDNA). The resultant product is amplified by PCR using primers specific for both the DNA of interest and the anchor (Apte, A. N., and P. D. Siebert, BioTechniques. 15:890-893 (1993); Troutt, A. B., et al., Proc. Natl. Acad. Sci. USA. 89:9823-9825 (1992); Zhang, X. H., and V. L. Chiang, Nucleic Acids Res. 24:990-991 (1996)). Furthermore, T4 RNA ligase has been used in fluorescence-, isotope- or biotin-labeling of the 5′-end of single-stranded DNA/RNA molecules (Kinoshita Y., et al., Nucleic Acid Res. 25:3747-3748 (1997)), synthesis of circular hammerhead ribozymes (Wang, L., and D. E. Ruffner. Nucleic Acids Res 26:2502-2504 (1998)), synthesis of dinucleoside polyphosphates (Atencia, E. A., et al. Eur. J. Biochem. 261:802-811 (1999)), and for the production of composite primers (Kaluz, S., et al., BioTechniques. 19:182-186 (1995)).

B. DNA Polymerase Activity and 3′-5′ Exonuclease Activity Are Found in Gene Products of Separate Genes in the Phage RM378 Genome

[0124] The predicted gene products of two open reading frames (ORF056e and ORF632e), which are widely separated in the genome of phage RM378, both showed similarity to family B type polymerases as shown below.

[0125] Identification of the ORF056e Gene Product as 3′-5′ Exonuclease

[0126] The predicted gene product of ORF056e (locus GP43a) was run against a sequence database (NCBI nr) in a similarity search using BLAST (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)) (Table 2). Out of 64 hits with an E-value lower (better) than 1, all sequences were of DNA polymerases of family B type, including DNA polymerase from bacteriophage RB69, archaeal DNA polymerases and E. coli polymerase II. Importantly, all these sequences are DNA polymerase sequences having the sequence characteristics of the DNA polymerase domain as well as the 3′-5′ exonuclease domain and are considerably longer (excluding partial sequences) than the predicted gene product of ORF056e, which has a length of 349 residues. The similarity is restricted to the N-terminal halves of these sequences, corresponding to the part of the protein where the 3′-5′ proofreading exonuclease domain is located.

[0127] Table 2 lists the 20 sequences with strongest similarity to the ORF056e sequence together with the length and E-value according to the BLAST search. The sequence identity with the ORF056e sequence ranges from 21 to 27%. Of the 64 sequences identified in the sequence database, 34 are of viral origin and 15 of archaeal origin. Out of the twenty top scoring sequences, 16 are of viral origin.

[0128] Identification of the ORF632e Gene Product as DNA Polymerase

[0129] The sequence similarity program BLAST (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)) was also used to identify potential homologues of the ORF632e (locus GP43b) gene product. The 100 sequences in the sequence database (NCBI nr) with the strongest similarity to the ORF632e sequence were all defined as DNA polymerase sequences. These sequences all had an E-value lower than 10⁻⁵ and are considerably longer (excluding partial sequences) than the predicted gene product of ORF632e, which has a length of 522 residues (Table 3). Sequence alignments between the ORF632e sequence and the sequences identified in the database show that the similarity is restricted to a domain with the DNA polymerase activity, as characterized by conserved sequence motifs such as DXXSLYPS (Hopfner, K. P. et al., Proc Natl Acad Sci USA 96:3600-3605 (1999)). In these sequences this domain is always preceded by a long N-terminal region where the 3′-5′ exonuclease activity normally is found. The corresponding N-terminal region is lacking in ORF632e, which consists only of the DNA polymerase domain (family B type polymerases). The sequence motif DXXSLYPS (SEQ ID NO:63) in the ORF632e sequence is found very close to its N-terminus, unlike its location in all the 100 analyzed sequences in the public database.

[0130] Table 3 lists the 20 sequences with strongest similarity to the ORF632e sequence together with the length and E-value according to a BLAST search. The sequence identity with the ORF632e sequence ranges from 23 to 28% within aligned regions of 300 to 428 residues. The majority of these 20 sequences are of archaeal DNA polymerases of family B type.

[0131] The results of the similarity searches indicated that the gene products of ORF056e and ORF632e correspond to the exonuclease domain and the polymerase domain of family B type polymerases, respectively. A partial alignment of sequences of a number of members of this family was obtained from the Protein Families Data Base of Alignments and HMMs (Sanger Institute, accession number PF00136). The sequences of ORF056e and ORF632e could be combined as one continuous polypeptide and aligned to the previous set of sequences. The coordinates of the three-dimensional structures of DNA polymerases from bacteriophage RB69 (PDB ID 1WAJ), the archaeon Thermococcus gorgonarius (PDB ID 1TGO) and the archaeon Desulfurococcus strain Tok (PDB ID 1QQC) were structurally aligned and a sequence alignment was produced from the structural alignment. The corresponding sequences were added to the previous alignment and the alignment adjusted, guided by the alignment from the structural superposition, mainly in regions which are less conserved. The resulting alignment, shown in FIG. 3, strongly supports the previous interpretation that 3′-5′ proofreading activity and DNA polymerase activity are found in two proteins encoded by separate genes in bacteriophage RM378. As seen in the alignment (FIG. 3), the major conserved regions in this protein family in the 3′-5′ exonuclease domain and in the polymerase domain are also conserved in the gene products of ORF056e and ORF632e, respectively. As defined by Hopfner et al. (Hopfner, K. P. et al., Proc Natl Acad Sci USA 96:3600-3605 (1999)), this includes regions exo I, -II and -III in the exonuclease domain and motifs A, -B and -C in the polymerase protein. Motif A corresponds to the DXXSLYPS motif mentioned above and includes an aspartic acid residue, involved in coordinating one of the two Mg2+ ions which are essential for the polymerase activity, and a tyrosine residue which stacks its side chain against an incoming nucleotide in the polymerase reaction. Another aspartic acid residue, which also acts as a Mg2+ ion ligand (motif C) and is essential for the catalytic mechanism, is also found in the sequence of ORF632e (D215). Inspection of the three-dimensional structure of bacteriophage RB69 DNA polymerase (PDB ID 1WAJ), with respect to the alignment, shows that the end of the ORF056e sequence and the beginning of the ORF632e sequence are found between the 3′-5′ exonuclease domain and the DNA polymerase domain.

[0132] The polymerase activity encoded by bacteriophage RM378 thus resides in an enzyme which is relatively short, corresponding only to the polymerase domain of other members in this family, and which, unlike those relatives, does not have a 3′-5′ exonuclease domain. The 3′-5′ exonuclease is found as another protein encoded by a separate gene elsewhere in the genome. The natural form of DNA polymerase from Thermus aquaticus (Taq) also lacks the proofreading 3′-5′ exonuclease activity, but this polymerase differs from the polymerase of RM378 in several aspects: i) it belongs to a different family of polymerases (family A), which have a different general architecture, ii) the lack of 3′-5′ exonuclease activity is due to a non-functional domain, since it still contains a structural domain homologous to the domain where this activity resides in other polymerases in this family, and iii) naturally occurring Taq has 5′-3′ exonuclease activity besides its polymerase activity (Kim, Y. et al., Nature 376:612-616 (1995)). Thus, the current protein is the only known example of a DNA polymerase which by nature lacks proofreading activity and the corresponding structural domain present in other polymerases of this type, and therefore represents the discovery of a unique compact type of DNA polymerase found in nature lacking both 3′-5′ and 5′-3′ exonuclease activity.

C. ORF739f Encodes an RNA Ligase

[0133] Several sequences of RNA ligases in a protein sequence database showed similarity to the ORF739f sequence (locus GP63) as identified in a similarity search using BLAST (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)). The top scoring sequences found in the BLAST search are shown in Table 4. Only 3 sequences showed a score with an E-value below 1.0. The two most significant and extensive similarities were found to the sequences of RNA ligases from Autographa californica nucleopolyhedrovirus and bacteriophage T4. The similarity to the third sequence, that of a DNA helicase, is much less extensive and has a considerably higher E-value. The sequence identity between the ORF739f sequence and the two RNA ligase sequences is 23% over regions of 314 and 381 residues. A sequence alignment of these three sequences is shown in FIG. 4.

[0134] The site of covalent reaction with ATP (adenylation) has been located at residue K99 in bacteriophage T4 RNA ligase (Thogersen H C, et al., Eur J Biochem 147:325-9 (1985); Heaphy, S., Singh, M. and Gait, M. J., Biochemistry 26:1688-96 (1999)). A corresponding lysine residue (K126) is also found in the sequence of ORF739f. An aspartic acid residue close to the adenylation site in T4 RNA ligase has also been implicated as important for the catalytic mechanism (Heaphy, S., Singh, M. and Gait, M. J., Biochemistry 26:1688-96 (1999)). This residue is also conserved in ORF739f (D128). It has been suggested that the motif KX(D/N)G may be a signature element for covalent catalysis in nucleotidyl transfer (Cong, P., and Shuman, S., J Biol Chem 268:7256-60 (1993)). The conservation of these active site residues supports the interpretation of the ORF739f gene product as an RNA ligase having a catalytic mechanism in common with other RNA ligases and involving covalent reaction with ATP.
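As an illustration only, a scan for the KX(D/N)G signature discussed above can be sketched in Python as follows. The function name and the toy sequence are hypothetical and are not the ORF739f or T4 RNA ligase sequences; the sketch merely shows how the motif places the acidic (or amide) residue two positions after the lysine, which is the spacing seen for K126 and D128.

```python
import re

def find_kxdg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the lysine, matched residues) per hit.

    The motif is K, any residue, D or N, then G. A lookahead is used so
    that overlapping occurrences are all reported.
    """
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(K.[DN]G))", protein)]

# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MSTAKEDGLLPKANGQ"
hits = find_kxdg(toy_sequence)
for position, residues in hits:
    print(f"K{position}: {residues}")
```

A real analysis would run the same scan on the sequences in FIG. 4 and compare hit positions within the alignment rather than in isolation.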

[0135] Table 4 shows sequences with strongest similarity (E-value cutoff of 1.0) to the ORF739f sequence together with their length and E-value according to the BLAST search.

D. ORF 1218a Encodes a Gene Product with 5′-3′ Exonuclease Activity

[0136] A BLAST search (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)) identified about 60 sequences in the database (NCBI nr) with significant similarity (corresponding to an E-value lower than 1) to the sequence of the predicted gene product of ORF 1218a (locus DAS). Almost all the identified sequences are of DNA polymerase I from bacterial species (DNA polymerase family A), and the similarity is restricted to the N-terminal halves of these sequences. The ORF 1218a sequence is much shorter, 318 residues, compared to the identified sequences, which usually are between 800 and 900 residues (Table 5).

[0137] Structural and functional studies of DNA polymerases of this type (family A) have defined the different structural domains and how these correlate with the different activities of the enzyme. Polymerases of this type normally have a polymerase activity located in a C-terminal domain and two exonuclease activities, a 3′-5′ exonuclease proofreading activity in a central domain and a 5′-3′ exonuclease activity in an N-terminal domain (Kornberg, A. and Baker, T. A., DNA Replication, Freeman, N.Y. (1992); Brautigam, C. A. and Steitz, T. A., Curr. Opin. Struct. Biol. 8:45-63 (1998)). The sequence of ORF 1218a corresponds to the 5′-3′ exonuclease domain of these polymerases.

[0138] The 5′-3′ exonuclease domain of DNA polymerase I belongs to a large family of proteins which also includes ribonuclease H (RNase H), including bacteriophage T4 RNase H. The analysis of the structure of bacteriophage T4 RNase H revealed the conservation of several acidic residues in this family of proteins. These residues are clustered at the active site, and some of them help coordinate two functionally important Mg2+ ions (Mueser, T. C., et al., Cell 85:1101-12 (1996)). The corresponding alignment shown in FIG. 5, including the sequence of the ORF 1218a gene product, shows that these acidic residues (possibly with the exception of one) are also found in the gene product of ORF 1218a, thus further supporting its proposed activity as a 5′-3′ exonuclease.

[0139] The 5′-3′ exonuclease of polymerase I and RNase H both remove RNA primers that have been formed during replication, but T4 DNA polymerase and other polymerases of the same type (family B), including the polymerase of phage RM378 identified here (see above), lack the 5′-3′ exonuclease activity. T4 RNase H (305 residues) and the ORF1218a gene product (318 residues) are of similar size, with conserved regions scattered throughout most of the sequences (FIG. 5). These proteins are likely to have a very similar structure given the structural similarity between T4 RNase H and the 5′-3′ exonuclease domain of polymerase I (Mueser, T. C., et al., Cell 85:1101-12 (1996)). The gene product of ORF1218a probably has a function analogous to the function of RNase H in bacteriophage T4.

[0140] Table 5 sets forth the 21 sequences with strongest similarity to the ORF1218a sequence together with the length and E-value according to the BLAST search. The sequence identity with the ORF1218a sequence ranges from 31 to 41% within aligned regions of 82 to 145 residues.
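For readers unfamiliar with how a figure such as "31 to 41% within aligned regions of 82 to 145 residues" is obtained, the following Python sketch computes percent identity over a pairwise alignment. It assumes the convention in which identical columns are divided by the full alignment length, with gap columns counted as non-identical; the function name and the two toy aligned strings are hypothetical and unrelated to the sequences of Table 5.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must be rows of one pairwise alignment (equal length,
    '-' for gaps). Gap columns count toward the length but never match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

# Hypothetical toy alignment: 8 columns, 5 of them identical.
row_a = "MKRL-EEL"
row_b = "MKKLSEDL"
identity = percent_identity(row_a, row_b)
print(f"{identity:.1f}% identity over {len(row_a)} aligned columns")
```

Because the value depends on the length of the aligned region, an identity percentage is only meaningful when reported together with that length, as is done in the paragraph above.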

E. A Replicative DNA Helicase is Part of the Replication Machinery of Phage RM378

[0141] Several sequences of replicative DNA helicases were identified in a similarity search using BLAST (Altschul, S. F., et al., J. Mol. Biol. 215:403-410 (1990)) with the ORF1293b (locus GP41) sequence as query sequence. Fifteen sequences had an E-value lower than 1.0, with the sequence of bacteriophage T4 replicative DNA helicase (product of gene 41, accession number P04530) having by far the lowest E-value. Some of the sequences found in the similarity search are hypothetical proteins and some are defined as RAD4 repair protein homologues. However, the most extensive similarity was found with the replicative helicase sequences, with sequence identity of 20-23% spanning 210-295 residues, and these sequences are all of length similar to the length of the ORF1293b gene product (416 residues). Table 6 shows the identified sequences of the similarity search.

[0142] The replicative DNA helicases with similarity to the ORF1293b sequence are of the same protein family, often named after the corresponding helicase in E. coli encoded by the dnaB gene (e.g. DnaB-like helicases). The Protein Families Data Base of Alignments and HMMs (Sanger Institute) holds 37 sequences in this family (family DnaB, accession number PF00772), and the alignment of these sequences clearly shows several regions with conserved sequence motifs. One of these motifs is characteristic for ATPases and GTPases (Walker A motif, P-loop) and forms a loop that is involved in binding the phosphates of the nucleotide (Sawaya, M. R. et al., Cell 99:167-77 (1999)). The replicative helicases bind single-stranded DNA (at the replication fork) and translocate in the 5′-3′ direction with ATP (GTP) driven translocation (Matson, S. W., et al., BioEssays 16:13-22 (1993)). The significant similarity found in the BLAST search to sequences other than helicase sequences is partly due to the presence of an ATP/GTP binding sequence motif in these sequences.

[0143] FIG. 6 shows the sequence alignment of some members of the DnaB protein family together with the sequence of ORF1293b. Sawaya et al. have shown how several conserved motifs and functionally important residues of the DnaB family relate to the crystal structure of the helicase domain of the T7 helicase-primase (Sawaya, M. R. et al., Cell 99:167-77 (1999)). The alignment in FIG. 6 shows how these conserved motifs are present in the ORF1293b sequence, thereby supporting its role as a replicative helicase.

[0144] The bacteriophage T4 replicative helicase sequence was indicated as most closely related to the ORF1293b sequence in the similarity search. The structure and function of the corresponding helicases may be very similar in these two bacteriophages and, together with the similarity of numerous other components of these phages, may be indicative of other similarities of their replication machinery. T4 replicative helicase is known to be an essential protein in phage replication and to interact with other proteins at the replication fork, such as the primase, to form the primosome (Nossal, N. G., FASEB J. 6:871-8 (1992)). Similarly, the helicase encoded by ORF1293b may have an essential function in bacteriophage RM378. Other homologues of components of the T4 replication system have been detected as well, as shown above, and still others may also be expected to be encoded by the bacteriophage genome.

[0145] Table 6 sets forth sequences with strongest similarity (E-value cutoff of 1.0) to the ORF1293b sequence together with the length and E-value according to the BLAST search.

F. Subcloning of Selected ORFs from RM378

[0146] Plasmids designated pSH1, pGK1, pOL6, pJB1 and pJB2 were generated for the genes encoding the 3′-5′ exonuclease, the DNA polymerase, the RNA ligase, the RNase H and the helicase, respectively. The correct insertion of the ORFs into the expression vector was verified by DNA sequencing, and the expression of the genes was verified by SDS gel electrophoresis of crude extracts of the respective host strains.

[0147] E. coli strain JM109 [supE44 Δ(lac-proAB), hsdR17, recA1, endA1, gyrA96, thi-1, relA1 (F′ traD36, proAB, lacIqZΔM15)] (Vieira and Messing, Gene, 19:259-268 (1982)) and strain XL10-Gold [Tetr Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac Hte (F′ proAB lacIqZΔM15 Tn10 (Tetr) Amy Camr)] (Stratagene) were used as hosts for expression plasmids.

[0148] Restriction enzyme digestions, plasmid preparations, and other in vitro manipulation of DNA were performed using standard protocols (Sambrook et al., Molecular Cloning 2nd Ed. Cold Spring Harbor Press, 1989).

[0149] The PCR amplification of the nucleic acid sequence containing the open reading frame (ORF) 056e, which displayed similarity to the 3′-5′ exonuclease domain of family B polymerase genes, was as follows. The forward primer exo-f: CACGAGCTC ATG AAG ATC ACG CTA AGC GCA AGC (SEQ ID NO:64), spanning the start codon (underlined) and containing a restriction enzyme site, was used with the reverse primer exo-r: ACAGGTACC TTA CTC AGG TAT TTT TTT GAA CAT (SEQ ID NO:65), containing a restriction site and spanning the stop codon (underlined, reverse complement) [codon 350 of ORF 056e shown in FIG. 7]. The PCR amplification was performed with 0.5 U of Dynazyme DNA polymerase (Finnzyme), 10 ng of RM378 phage DNA, a 1 μM concentration of each synthetic primer, a 0.2 mM concentration of each deoxynucleoside triphosphate, and 1.5 mM MgCl₂ in the buffer recommended by the manufacturer. A total of 30 cycles were performed. Each cycle consisted of denaturing at 94° C. for 50 s, annealing at 50° C. for 40 s, and extension at 72° C. for 90 s. The PCR products were digested with Kpn I and Sac I and ligated into Kpn I and Sac I digested pTrcHis A (Invitrogen) to produce pSH1. Epicurian Coli XL10-Gold (Stratagene) were transformed with pSH1 and used for induction of protein expression, although any host strain carrying a lac repressor could be used.
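The primer design and cycling program described above can be checked with a short Python sketch. The primer sequences are those given in the text with spaces removed; the Sac I (GAGCTC) and Kpn I (GGTACC) recognition sequences are standard, while the helper names are hypothetical and introduced only for illustration. The cycling total is an assumption-laden lower bound that ignores ramp times and any initial denaturation or final extension, which the text does not specify.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

# Primers exo-f and exo-r as listed in the text (SEQ ID NO:64 and 65).
EXO_F = "CACGAGCTCATGAAGATCACGCTAAGCGCAAGC"
EXO_R = "ACAGGTACCTTACTCAGGTATTTTTTTGAACAT"

# Recognition sequences of the two enzymes used for cloning into pTrcHis A.
SITES = {"SacI": "GAGCTC", "KpnI": "GGTACC"}

def sites_in(primer: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in the primer to its 0-based index."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

forward_sites = sites_in(EXO_F)
reverse_sites = sites_in(EXO_R)

# The codon immediately 3' of each restriction site.
start_codon = EXO_F[9:12]
stop_codon = reverse_complement(EXO_R[9:12])  # read on the coding strand

# Minimum time spent in the three programmed steps over 30 cycles.
cycles = 30
seconds_per_cycle = 50 + 40 + 90
total_minutes = cycles * seconds_per_cycle / 60

print(forward_sites, reverse_sites, start_codon, stop_codon, total_minutes)
```

Under these assumptions the forward primer carries the Sac I site directly upstream of ATG, the reverse primer carries the Kpn I site directly next to the complement of a TAA stop codon, and the programmed steps alone account for 90 minutes.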

[0150] The PCR amplification of the nucleic acid sequence containing ORF 632e, which exhibited similarity to the DNA polymerase domain of family B polymerase genes, was performed as described above for the putative 3′-5′ exonuclease gene except that other PCR primers were used. The forward primer pol-f: CACGAGCTCATGAACATCAACAAGTATCGTTAT (SEQ ID NO:66), spanning the start codon (underlined) and containing a restriction enzyme site, was used with the reverse primer pol-r: ACAGGTACCTTAGTTTTCACTCTCTACAAG (SEQ ID NO:67), containing a restriction site and spanning the stop codon (underlined, reverse complement) [codon 523 of ORF 632e shown in FIG. 8]. The PCR products were digested with Kpn I and Sac I and ligated into Kpn I and Sac I digested pTrcHis A (Invitrogen) to produce pGK1. Epicurian Coli XL10-Gold (Stratagene) were transformed with pGK1 and used for induction of protein expression. The expressed protein was observed with Anti-Xpress Antibody (Invitrogen) after Western blot.

[0151] The PCR amplification of the nucleic acid sequence containing ORF 739f (which displayed similarity to the T4 RNA ligase gene) was similar to the procedure described above for the putative 3′-5′ exonuclease gene, except that other PCR primers were used. The forward primer Rlig-f: GGG AAT TCT TAT GAA CGT AAA ATA CCC G (SEQ ID NO:68), spanning the start codon (underlined) and containing a restriction enzyme site, was used with the reverse primer Rlig-r: GGA GAT CTT ATT TAA ATA ACC CCT TTT C (SEQ ID NO:69), containing a restriction site and spanning the stop codon (underlined, reverse complement) [codon 437 of the ORF shown in FIG. 9]. The PCR products were digested with EcoRI and BglII. Subsequently the amplified products were cloned into EcoRI and BamHI digested pBTac1 (Amann et al., Gene 25:167-178 (1983)) to produce pOL6. Cells of E. coli strain JM109 were transformed with pOL6 and used for induction of protein expression, although any host strain carrying a lac repressor could be used.
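The cloning step above joins a BglII-digested insert end to a BamHI-digested vector end. The reason this works is that the two enzymes leave identical single-stranded overhangs, which the following Python sketch illustrates. The recognition sequences and cut positions (one base into the site on each strand) are the standard ones for these enzymes; the dictionary layout and function names are hypothetical.

```python
# Recognition site and the number of bases 5' of the cut on each strand.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
}

def five_prime_overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left by a palindromic staggered cutter.

    For a palindromic site cut `offset` bases from the 5' end of each
    strand, the overhang is the stretch between the two cut positions.
    """
    site, offset = ENZYMES[enzyme]
    return site[offset:len(site) - offset]

def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True when the two enzymes leave identical cohesive ends."""
    return five_prime_overhang(enzyme_a) == five_prime_overhang(enzyme_b)

# Primers Rlig-f and Rlig-r as listed in the text (SEQ ID NO:68 and 69).
RLIG_F = "GGGAATTCTTATGAACGTAAAATACCCG"
RLIG_R = "GGAGATCTTATTTAAATAACCCCTTTTC"

ecori_in_forward = ENZYMES["EcoRI"][0] in RLIG_F
bglii_in_reverse = ENZYMES["BglII"][0] in RLIG_R
bglii_bamhi_ok = compatible("BglII", "BamHI")

print(five_prime_overhang("BglII"), five_prime_overhang("BamHI"), bglii_bamhi_ok)
```

Both BglII and BamHI leave a GATC overhang, so the ends anneal and can be ligated. The resulting junction reads AGATCC on one strand, which is neither a BglII nor a BamHI site, so that junction is not recut by either enzyme.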

[0152] The PCR amplification of the nucleic acid sequence containing ORF 1218a (which displayed similarity to the T4 RNase H gene) was similar to the procedure described above for the putative 3′-5′ exonuclease gene except that other PCR primers were used. The forward primer RnH-f: GGGAATTCTT ATG AAA AGA CTG AGG AAT AT (SEQ ID NO:70), spanning the start codon (underlined) and containing a restriction enzyme site, was used with the reverse primer RnH-r: GGA GAT CTC ATA GTC TCC TCT TTC TT (SEQ ID NO:71), containing a restriction site and spanning the stop codon (underlined, reverse complement) [codon 319 of the ORF shown in FIG. 10]. The PCR products were digested with EcoRI and BglII and ligated into EcoRI and BamHI digested pBTac1 (Amann et al., Gene 25:167-178 (1983)) to produce pJB1. As for the RNA ligase clone, cells of E. coli strain JM109 were transformed with pJB1 and used for induction of protein expression.

[0153] The PCR amplification of the nucleic acid sequence containing ORF 1293b, which displayed similarity to the dnaB-like helicase genes, was as described above for the putative 3′-5′ exonuclease gene except that other PCR primers were used. The forward primer HelI-f: GGGCAATTGTT ATG GAA ACG ATT GTA ATT TC (SEQ ID NO:72), spanning the start codon (underlined) and containing a restriction enzyme site, was used with the reverse primer HelI-r: CGGGATCC TCA TTT AAC AGC AAC GTC (SEQ ID NO:73), containing a restriction site and spanning the stop codon (underlined, reverse complement) [codon 417 of the ORF shown in FIG. 11]. The PCR products were digested with EcoRI and BglII and ligated into EcoRI and BamHI digested pBTac1 (Amann et al., Gene 25:167-178 (1983)) to produce pJB2. Cells of E. coli strain JM109 were transformed with pJB2 and used for induction of protein expression.

[0154] Deposit of Biological Material

[0155] A deposit of Rhodothermus marinus strain ITI 378, and a deposit of Rhodothermus marinus strain ITI 378 infected with bacteriophage RM 378, were made at the following depository under the terms of the Budapest Treaty:

[0156] Deutsche Sammlung Von Mikroorganismen und Zellkulturen GmbH

[0157] (DSMZ)

[0158] Mascheroder Weg 1b

[0159] D-38124 Braunschweig, Germany.

[0160] The deposit of Rhodothermus marinus strain ITI 378 received accession number DSM 12830, with an accession date of May 28th, 1999. The infected strain (Rhodothermus marinus strain ITI 378 infected with bacteriophage RM 378) received accession number DSM 12831, with an accession date of May 31st, 1999.

[0161] During the pendency of this application, access to the deposits described herein will be afforded to the Commissioner upon request. All restrictions upon the availability to the public of the deposited material will be irrevocably removed upon granting of a patent on this application, except for the requirements specified in 37 C.F.R. 1.808(b) and 1.806. The deposits will be maintained in a public depository for a period of at least 30 years from the date of deposit or for the enforceable life of the patent or for a period of five years after the date of the most recent request for the furnishing of a sample of the biological material, whichever is longer. The deposits will be replaced if they should become nonviable or nonreplicable.

TABLE 1 Comparison of Structural Features of T4 and RM 378

| Feature | T4 | RM 378 |
| --- | --- | --- |
| Phage type | T-even, A2 morphology | T-even, A2 morphology |
| Family | Myoviridae | Myoviridae |
| Genome size | 168,900 bases | ca 130,480 bases |
| Number of ORFs | ca 300 | >200 |
| Characteristic structural proteins | GP3, GP13, GP17, GP18, GP20, GP21, GP23 | Putative homologs of the same were identified |
| Arrangement of structural proteins | All of the above genes are on the same strand and clustered in a region covering 35 kb | All of the above genes were dispersed over the whole genome and found on both strands |
| Representative enzymes | lysozyme and thymidine kinase (on same strand) | lysozyme and thymidine kinase (on different strands) |

[0162] TABLE 2

| Source: | Accession #: | Definition: | Length: | E-value*: |
| --- | --- | --- | --- | --- |
| Spodoptera litura nucleopolyhedrovirus | AAC33750.1 | DNA polymerase (partial) | 603 | 9e−08 |
| Spodoptera littoralis nucleopolyhedrovirus | AAF61904.1 | DNA polymerase | 998 | 9e−08 |
| Sulfurisphaera ohwakuensis | O50607 | DNA POLYMERASE I (DNA POLYMERASE B1) | 872 | 3e−07 |
| Xestia c-nigrum granulovirus | AAC06350.1 | DNA polymerase | 1098 | 4e−07 |
| Lymantria dispar nucleopolyhedrovirus | T30431 | DNA-directed DNA polymerase | 1014 | 5e−07 |
| Lymantria dispar nucleopolyhedrovirus | P30318 | DNA POLYMERASE | 1013 | 5e−07 |
| Buzura suppressaria nucleopolyhedrovirus | AAC33747.1 | DNA polymerase (partial) | 647 | 8e−07 |
| Sulfolobus acidocaldarius | P95690 | DNA POLYMERASE I | 875 | 4e−06 |
| Bacteriophage RB69 | Q38087 | DNA POLYMERASE | 903 | 5e−06 |
| Spodoptera exigua nucleopolyhedrovirus | AAC33749.1 | DNA polymerase (partial) | 636 | 2e−04 |
| Spodoptera exigua nucleopolyhedrovirus | AAF33622.1 | DNA polymerase | 1063 | 2e−04 |
| Mamestra brassicae nucleopolyhedrovirus | AAC33746.1 | DNA polymerase (partial) | 628 | 9e−04 |
| Melanoplus sanguinipes entomopoxvirus | AAC97837.1 | putative DNA polymerase | 1079 | 9e−04 |
| Orgyia anartoides nucleopolyhedrovirus | AAC33748.1 | DNA polymerase | 658 | 0.003 |
| Sulfolobus solfataricus | AAB53090.1 | DNA polymerase | 882 | 0.003 |
| Sulfolobus solfataricus | P26811 | DNA POLYMERASE I | 882 | 0.003 |
| Human herpesvirus 7 | AAC40752.1 | catalytic subunit of replicative DNA polymerase | 1013 | 0.004 |
| Human herpesvirus 7 | AAC40752.1 | catalytic subunit of replicative DNA polymerase | 1013 | 0.004 |
| Methanococcus voltae | P52025 | DNA POLYMERASE | 824 | 0.010 |
| Bombyx mori nuclear polyhedrosis virus | P41712 | DNA POLYMERASE | 986 | 0.013 |
| Bombyx mori nuclear polyhedrosis virus | BAA03756.1 | DNA polymerase | 986 | 0.051 |

[0163] TABLE 3

| Source: | Accession #: | Definition: | Length: | E-value*: |
| --- | --- | --- | --- | --- |
| Aeropyrum pernix | O93745 | DNA POLYMERASE I | 959 | 4e−20 |
| Aeropyrum pernix | BAA75662.1 | DNA polymerase | 923 | 4e−20 |
| Aeropyrum pernix | BAA75663.1 | DNA polymerase II | 772 | 7e−14 |
| Aeropyrum pernix | O93746 | DNA POLYMERASE II | 784 | 7e−14 |
| Pyrodictium occultum | BAA07579.1 | DNA polymerase | 914 | 2e−16 |
| Pyrodictium occultum | A56277 | DNA-directed DNA polymerase | 879 | 2e−16 |
| Pyrodictium occultum | B56277 | DNA-directed DNA polymerase | 803 | 6e−11 |
| Sulfolobus acidocaldarius | P95690 | DNA POLYMERASE I | 875 | 5e−16 |
| Archaeoglobus fulgidus | O29753 | DNA POLYMERASE | 781 | 1e−14 |
| Chlorella virus NY2A | P30320 | DNA POLYMERASE | 913 | 3e−14 |
| Thermococcus gorgonarius | P56689 | DNA POLYMERASE | 773 | 4e−14 |
| Paramecium bursaria Chlorella virus 1 | A42543 | DNA-directed DNA polymerase | 913 | 9e−14 |
| Paramecium bursaria Chlorella virus 1 | P30321 | DNA POLYMERASE | 913 | 4e−13 |
| Pyrobaculum islandicum | AAF27815.1 | family B DNA polymerase | 785 | 9e−14 |
| Homo sapiens | P09884 | DNA POLYMERASE ALPHA CATALYTIC SUBUNIT | 1462 | 1e−13 |
| Homo sapiens | NP_002682.1 | polymerase (DNA directed), delta 1, catalytic subunit | 1107 | 6e−07 |
| Homo sapiens | S35455 | DNA-directed DNA polymerase delta 1 | 107 | 9e−07 |
| Chlorella virus K2 | BAA35142.1 | DNA polymerase | 913 | 3e−13 |
| Sulfolobus solfataricus | AAB53090.1 | DNA polymerase | 882 | 3e−13 |
| Sulfolobus solfataricus | P26811 | DNA POLYMERASE I | 882 | 3e−13 |

[0164] TABLE 4

| Source: | Accession #: | Definition: | Length: | E-value*: |
| --- | --- | --- | --- | --- |
| Autographa californica nucleopolyhedrovirus | P41476 | PUTATIVE BIFUNCTIONAL POLYNUCLEOTIDE KINASE/RNA LIGASE | 694 | 3e−07 |
| Coliphage T4 | P00971 | RNA LIGASE | 374 | 0.002 |
| Aquifex aeolicus | D70476 | DNA helicase | 530 | 0.25 |

[0165] TABLE 5

| Source: | Accession #: | Definition: | Length: | E-value*: |
| --- | --- | --- | --- | --- |
| Streptococcus pneumoniae | P13252 | DNA POLYMERASE I | 877 | 2e−08 |
| Lactococcus lactis subsp. cremoris | O32801 | DNA POLYMERASE I | 877 | 2e−06 |
| Bacillus stearothermophilus | AAB52611.1 | DNA polymerase I | 876 | 1e−05 |
| Bacillus stearothermophilus | AAB62092.1 | DNA polymerase I | 877 | 2e−05 |
| Bacillus stearothermophilus | S70368 | DNA polymerase I | 876 | 2e−05 |
| Bacillus stearothermophilus | P52026 | DNA POLYMERASE I | 876 | 2e−05 |
| Bacillus stearothermophilus | JC4286 | DNA-directed DNA polymerase | 879 | 4e−05 |
| Bacillus stearothermophilus | AAA85558.1 | DNA polymerase | 954 | 4e−05 |
| Thermus thermophilus | 2113329A | DNA polymerase | 834 | 3e−05 |
| Thermus thermophilus | P52028 | DNA POLYMERASE I | 834 | 3e−05 |
| Thermus thermophilus | BAA85001.1 | DNA polymerase | 834 | 3e−05 |
| Bacillus subtilis | O34996 | DNA POLYMERASE I | 880 | 4e−05 |
| Bacillus caldotenax | Q04957 | DNA POLYMERASE I | 877 | 4e−05 |
| Deinococcus radiodurans | A40597 | DNA-directed DNA polymerase | 921 | 4e−05 |
| Deinococcus radiodurans | P52027 | DNA POLYMERASE I | 956 | 4e−05 |
| Aquifex aeolicus | D70440 | DNA polymerase I 3′-5′ exo domain | 289 | 7e−05 |
| Thermus filiformis | O52225 | DNA POLYMERASE I | 833 | 7e−05 |
| Anaerocellum thermophilum | Q59156 | DNA POLYMERASE I | 850 | 3e−04 |
| Rickettsia felis | CAB56067.1 | DNA polymerase I | 922 | 3e−04 |
| Rhodothermus sp. ′ITI 518′ | AAC98908.1 | DNA polymerase type I | 924 | 4e−04 |
| Thermus aquaticus | P19821 | DNA POLYMERASE I | 832 | 4e−04 |

[0166] TABLE 6

| Source: | Accession #: | Definition: | Length: | E-value*: |
| --- | --- | --- | --- | --- |
| coliphage T4 | P04530 | PRIMASE-HELICASE (PROTEIN GP41) | 475 | 3e−06 |
| Campylobacter jejuni | CAB75198.1 | replicative DNA helicase | 458 | 0.003 |
| Listeria monocytogenes | Q48761 | DNA REPAIR PROTEIN RADA HOMOLOG | 452 | 0.003 |
| Listeria monocytogenes | AAC33293.1 | RadA homolog | 457 | 0.016 |
| Mycoplasma arthritidis bacteriophage MAV1 | AAC33767.1 | putative replication protein | 276 | 0.007 |
| Aeropyrum pernix | B72665 | hypothetical protein | 726 | 0.016 |
| Porphyra purpurea | P51333 | PROBABLE REPLICATIVE DNA HELICASE | 568 | 0.027 |
| Escherichia coli | P03005 | REPLICATIVE DNA HELICASE | 471 | 0.047 |
| Saccharomyces cerevisiae | NP_011861.1 | SH3 domain | 452 | 0.047 |
| Chlamydia trachomatis | O84300 | DNA REPAIR PROTEIN RADA HOMOLOG | 454 | 0.14 |
| Haemophilus influenzae | P45256 | REPLICATIVE DNA HELICASE | 504 | 0.14 |
| Caenorhabditis elegans | T16375 | hypothetical protein | 566 | 0.18 |
| Pyrococcus horikoshii | E71133 | hypothetical protein | 483 | 0.18 |
| Cyanidium caldarium | AAF12980.1 | unknown; replication helicase subunit | 489 | 0.53 |
| Rickettsia prowazekii | Q9ZD04 | DNA REPAIR PROTEIN RADA HOMOLOG | 448 | 0.69 |

[0167] While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

1 73 1 129908 DNA Bacteriophage RM378 1 cgggtctgct tttccttcac ggacccaattctccgtgaaa gaaatacgac attcatactg 60 cacctcctgg ttggtttaat tagggttaatgttatacctt ttcaggaact tcgatcgctt 120 taactccctc tgatgaagca cggtttccaccgcggcaaaa atcaccagca gcagaaacca 180 cgctcctatc agaagcagcg ccgtttcaaaataaacccac ttcatcacaa gccccgccag 240 tacgaaggcg ggcaaaaaac ctgttataacgtaaacagcg ctcatggttc acccctgagt 300 ctggagtgca aaggcacctg taatatccacccaaccctca tgacgaaata ccgtcttctt 360 gaccacggtt ccgtcgggct gctgctcctccaccgacacc ctcttcgtaa aggaaaacag 420 aggaatgata cagaaagcca tcagagcattgaccagaaac ggcatgaagt agggcgcgcc 480 tccaatcatg gcggcaccca gcaaaatatcctccgtctcc ccactcttct tgaaaaaccc 540 ggcggcaaga gcggcaatct cccacgcgcgagacatcatc tgatcgaaat caggaatttc 600 ctcgaacgta agaagctcct tcacccgattccactgatca tcaggaaggt ttacaacccc 660 cgcctccatc tgttcgggag tcggattgtgctgggtgaga ttcagaatcg tcatggcttg 720 tacctccgtt tgtttgttaa gtgatccacagccagtatac gcataaagcg gaaaaaagtc 780 aatcggtatt ttctttcttc atcttaatttcatttttttc cttgagggaa atatccgccg 840 catacatttt ttcggcttcc ttcagcacctcagagactct gctgaagatc tccctgagct 900 gaaccattga ccactccgga tcgtcaagcaccagtggaag cgcgtagccg tcttcatacg 960 tttcatagtt atcctcaaac agatcttccagcagacgatc cagcagtgca ggaatttcat 1020 atcggtaact cataactcct ccggcggttaacttatcggt aaaccttcac ggatgaaggt 1080 ctcatgtgaa tgaacacttt tgctcccggatacacttcat ccagaaccat aagcgccaca 1140 agcgaagcaa gcgtcatgta caccccctggatacctccca ccgtcatata atccagaaat 1200 ctgatcgtac ccgccgaaat ggtatcttcaagccccccat cccagataag atattccacg 1260 atggagggaa tctgaaccct gatcttttccagctcccgct cgatatggat atgcggatcc 1320 gaactacgcg ctgcttccct gcagataccttccgccagaa caagcaacgt ttcatttcga 1380 ttctgataaa aataattcag agcaccctcgaacagcgctt caggatctgt agccgcacgc 1440 ggaacgcttc tggaaacgtc tttcagaatctgcatcatga cgcacctcca ttttttccaa 1500 cataaccttc ttatttcctt ttcggttccacgcaatccca attaccacta ttaatctcca 1560 tcaagtcaga aacccgaatt caatttaaaacttttctgtc tgaaattccc ttataaccct 1620 taaaacttaa cactaccctt tcaacacaatcccaatcacc agtaaaaacc tacctgcatt 1680 agatctacta 
ctcccctttg aagcaaaaaggaaaaaacca aaaatcaaaa ttctataacc 1740 cctacaggat acgctcagct ttaagtcgcatattacccat tgggatttta gaattttaaa 1800 attttgtttt tctttaatct ccatagggtacgcttagcat tgagtcttaa tttaccattt 1860 gagggatttt aatttagaag tttttgtttttctttaatct ccatagggta cgcttagcat 1920 tgagtcttaa tttaccattt gagggattttaatttagaaa ttcaaaaatt taattttttc 1980 ataaccttga gtggcttatt tacctgtagagcgtcattca aaaaacaccc catttcaaga 2040 aaccttcaca ttgatctgtc gttttacaacataaaacctt taagtggtat atgatcagaa 2100 agcgtaaaaa atctgaacat atcggagcggatgcgattcc aaagggattg gtctatgatg 2160 ttttaaatct tgggtatacc gataagctcaaatcccgcat gattgctata ctgagcattc 2220 ttatctacca tcgccaccgg gaagatcacacctacgagat tgaaacagga tcgaaatgca 2280 agcgcatggt ggaagttaaa aaaggggagtcctggatcag cattccaacg cttattgagc 2340 gggtttacaa cacatttgga attaagcttaccagggagca ggttaaatat gcccttcgtt 2400 tgcttttaca gcatggtctg atttcggtaaaggaagcaac cggtggggtt tcgaaaggtc 2460 attttggaaa catttataca ttcagagaaacggattttga aggagagttt gtcgatcctg 2520 tggattttgt gagggaaaat agtgaaagtgaagaagaaat ctggtatgca gattatacgg 2580 aaagtcggta ttcgaatcgc gtaaccgtcagagaaggagc atttcatccg attatgaaaa 2640 gtaggacact tctaaaaacg catgtgcttagaaatcatcc agatagagaa aaagctacga 2700 agttttaccc gaaagagatt gttgtggatatcgaagcggg tggatatcgc gtagatgaaa 2760 cagagcggta cagacgcttt agactcttcgtgataaaccg cgctgcgaag tttgcgagaa 2820 agttcagagc acgctacggg gggaaagttgatatatgttt taccggaggt aggggaattc 2880 atctgcatat tacgggaagt gtgctcaatgttccaatgaa ccgcagtcaa ttcgacagga 2940 ttttgaaaga agcaattgtt cgtatgcttaaagatacgga actatggcgg tttttcgatc 3000 cttccacgct gaatcctttt cagcttgccggggttcgcgg aaaacttcat gataaggctc 3060 cttttgacga ctgggtgtat gtgaagcgtacctatcaaac gattaagccg ctcaaagccg 3120 gaagtctgct ttcgagtttt gaggaggcggctttctggat ttcgcgtagc ttcgtcagaa 3180 aagcggctaa aggcaatcca tttagaacgtacagtctggt aaaggaaggg ctactggaag 3240 gggagccgtg gagcgatcac catgcgggaagagatacggc tgctttctgc atggcatgcg 3300 atcttctgga agccggatac atgacggatcaggtgttgct gtttctgaaa gattgggata 3360 agaaaaacaa accttctctt ggagataagattatcgcgca 
gaaggtaaga tcggcgcggc 3420 ggcttcttgc gcgaaaagga aagcttaaagcaaacccttc tctacagctt ctctaattgt 3480 tttgaaaaag tgatagaatc tttccggggaaaagctgtat ccgatcatgt cggtgataac 3540 gctcatcaca ttgaaaaatc gttggacttcatccggattt cttctgtcga ttttgataag 3600 aagttcgatt atacgtttaa acgatttaggataatcgtac cacagaaagg aaagatatgc 3660 acccgacgat ccttcttcct cttcttttttcatgtaagaa cgaatatctt catacagata 3720 ccatatatcc acaatgcttt gagcgagtttctccataatt ctggtggcgg tgttgctgaa 3780 cgtattgata tactcacagg taacaacatacatattcttt ttgatttctt cgataatttc 3840 aacatgaata tttgtttcca gaagataaacagggaaagaa attgaaagtt tttcaagctg 3900 cacaatttta tgcagaaagg tgttttcgcgcacttcccaa tcccacagac atttcacagt 3960 cagatatatt tcatttctta taactttctccagttcgacg aaaatatacg atttattttc 4020 tataaagccg ggtaactctt catgaatgatgcggtttaag ttgctgtgtt ttttcatacg 4080 ggtgttatct ctcagcaatt ttcttttagcatttgccaca aatctctgat atctttcttc 4140 aaaatcttct tttttgaatt ggatgttggcttcattttgc aattgtctgg ttcttatagc 4200 aagcgtctca atgaacgttt tgattagaagtattgctccc ttagccatat cctgaatggt 4260 ggaatcggcg ggtaaatcca cacgaaagatttttcgagtt tcttcgtttt tgatcagtgc 4320 gacgttccat cccattctct tttccatgaaaaacctgagc gcccagacca gatattcgta 4380 gaagttttca gtcattttat tttaaatattcccttatctg tattccactt ccggagattc 4440 tatatggatg taaagtatat tttttcgtggtataaaattc atctgagcgt gcaccgcaat 4500 tttcaggtcg ttctcgctca aatgatgctcgccccacctg aacgatccga taaactgcag 4560 cgtgatatgg tgcgtgatca aatccgtagaaaactccagc ttatcatctg tttcgggtgg 4620 aagcacatcg gtagagggaa accgggatttcacatcaaga cgtgcaagca ccagataatg 4680 atggtaattc tcgacgtttg atgccggtgatatgtttttg agtttgtcgt ttaatgtgcg 4740 gattacttct ttcatttcct tttcgatggtggtatcctct gattcttttc tggagaaaat 4800 gtttttataa gagggtcctt ttttgatatgttctaccggg gaaatggctt taagcagtct 4860 gtaggcataa tgcaacgtgt cgtttatcatctttatgaat tttctgatcg caacgcttac 4920 aggtgtatcc tcagagatgc tcaggtagaccagatcgggg tgttttctca aatgataatc 4980 aggtggagag aagccgggat atttttccagatagtttttg atggtgtttg taagcagttc 5040 ctgatacgat ggcattttat tttaaataagtgttgataaa caaacaggct tttttcacat 5100 attcgaacaa 
ttcatctttg gaaagatggtgtggtttgta ctgacgatgc ataaacttat 5160 aagcaatatc aaacaatacc ctacccttttcctcactttc tttaataata tctataatgc 5220 tgtctatttc ctctttcagt tctctgtaaatggttcggta tcggtttatg tcctgctctg 5280 tcaggtgttc actattttca tattttatcccgaatacaga cataatacct attgttcctc 5340 cgataaattc cagtgttcca ttccgggtagaggcgcttac aaatgagttg cgaatcttca 5400 atgttttatc tgaaacgaaa cttttaaaacttaagtcaag gagcatttct gtaacaaaca 5460 gaggagcatt caggtggaaa atttttgcaaattttgaatt ttctcgggga aaccatttaa 5520 caaaaccgat tatcattata ccaagatgagcttcctttgc aaaaaccggc ataatgtaaa 5580 gatgaatttc gtttcgaaat tctatttcgtctgtgaagtg ggtgattaat tttttatcga 5640 tattattgta gtgtatagta tcgtctttaagttctttcat ttttttctga agtttcttta 5700 atgcctgttg aaacttctcc tcaatttgatctacagctat tttttcaact ccgatatctt 5760 cttcgaagtg tctgattttg gtaagtgatagttctatagt ctttttaagc gtatgaacat 5820 acatccgcgt aaaatctctg ggtaactgcttttcggcggg aataagacgc acttttacat 5880 aagaaggtcg tttctctata atttctacactggaaaattc cgatcttttt tttaaaatat 5940 tcaatgatgt tcattgcaag tagtgcttcaacaacgcctt ccatagtttt tttagctaag 6000 gttttttgtt tacagttttg tagttttcgttaattaaggt gtttaatgct attggttttt 6060 ttaactattc cccacgaact atctgtttcaatacacgata tctttccacc atatcgaggt 6120 ttataatatc cagcgctcta cctatttcatcaaacatttc gatcacgcgt tcatcctgat 6180 tgtttttgct gtgttcgata aggtttctgagttcaaaagc tccccgtata ccactggaat 6240 acagaaatgc gatttttgga atatcagggtgtccgggatt aataagagat agaaaatgtt 6300 caatgttttt tatgagttca tgaaggcgattatactgcat gttgaaataa gcgtgggctt 6360 ttgtcagcct gatgttttct tctatcatggggcgcacgta aaaactccca tgcagagcgg 6420 ttccgatatt ggtaaaaact gtctcgtgaaacagaatagc ggaaagtgcg gcgtttttaa 6480 ccggatagag agaacttgca ccaataaccgatatatcaaa cagagcagga aaaaactcga 6540 aaagactctg atcattaaag aaaatatgcatttcgttttt tcgatatacc agatcagggt 6600 atttatgtgt ggtgttaaat attttctgtattttctgaac cgtttctttc tccttattta 6660 ttttttcttt aaacttctca atagcctgctggtacttatc atttattttg ctgtctacag 6720 aaaacccgga ataaatggtt cgtgtcttttttataaaaaa ctcgatcagt tctttgaaca 6780 tgcgcggcgt ttcttttata atctcttttgcggttgcgtt 
ttcaccgatg tcaaggatta 6840 tggttacatg tttatcgccg gcgtctatgtttaccggata ctttttttga aatctgtaat 6900 actgctgaat tgcacttaaa atctctttacggtatttttt cggagtcata aggtgtcggg 6960 tttgatttta ttaaatcact caggtttttaagtcgtgcat gtttaaccca gttttttaac 7020 caccctgtta ttccaccata tgacttttccatctgatctt acgattcctc cgtatcccat 7080 gcggctcagg atctcattga tttttccgttttgaggaacg ttgagtgcac caaaatagag 7140 ttcagtaagt tgcttcataa aacggtttctatcctgattc agatcttctt ctatcatcat 7200 ctgaatgcgg gttggaaatg tatctacgatcaggtttacg acgtagactc tatcgctggc 7260 ttcccatctt gaaaggaaaa aggaatcatatcgaagcaac cggtcaaacg tttcgtcaac 7320 gacggttttg acaaatgtcg caagttttcgggtgaaaacg gctccggttt gctcgaaagt 7380 gataagcaac cctttgaaaa gcatttttcggagtgcggag aggctgtagg gaacgtcaaa 7440 atgaattccc ctttcgacga atccatatggcggctttcca aataccccct gaagtttcat 7500 tcggtgaagt tcccagccgc ttccaagaaattcgtcaatc tgaagttttt taagtttttt 7560 gagatcggag gagttaaggt gcacgccgaaatagttaagt gcgcccccgg tggacgcgaa 7620 aagggggagg ttgtaaaaat cttttggataatcgttttcc tttttgacgt tgagaaattc 7680 ctccggctcg atgatatagt agaggtgatacccgcgctcc aagatttccc gaagggtttt 7740 ttcgtttaac gggatttcgc tcataaggagtccgtttccc tccacagaag acacaatcag 7800 gtttgaggga tcaagcgttt cgattttttcaaggagctct ttcatacggg tatctgcagg 7860 gttatctgtt cgcggttaat ctgcacaacgattttgagaa ggtgtgtggc ttcgtcaaaa 7920 ctcacgtcta tagtatctat gtcgtagggttcgaggttgg aggcaatcag gttgaacagt 7980 tcatcataat cataattctc gaaaagaatgttgcgaatac cgatccctct ttctggatcg 8040 tagggatatt cccccggctc gatgaaaagcaggagtttta tcttatcgat caggagtttt 8100 accgggtcat caggaaatct gaaattcggtgcagtgtcgt tcagatagaa catttcattt 8160 ttgtttaaat aaatcctcga ggaatcttcaaataaagagg ggcgttaatg gatgaaaaga 8220 ctgaggaata tggtcaatct tatcgatctcaaaaatcagt attatgctta ctctttcaag 8280 tttttcgact cctatcagat cagctgggataattacccgc atcttaaaga gttcgtcatt 8340 gaaaactatc ccggcactta tttttcatgctacgctccgg ggattctgta caagcttttc 8400 ctcaaatgga agcggggtat gatcattgacgactatgacc gacacccgct ccgaaagaag 8460 ttacttcctc agtacaaaga gcaccgctatgaatacattg agggaaaata cggtgtggtt 8520 cctttccccg 
ggtttctgaa atatctgaagttccactttg aggacttgcg gtttaaaatg 8580 cgcgatcttg gaatcaccga tttcaaatatgcacttgcca tttctctttt ttacaaccgg 8640 gtaatgctca gagattttct gaaaaactttacctgttatt acattgccga atatgaagct 8700 gacgatgtaa tcgcacatct ggcgcgtgagattgcacgaa gcaatatcga cgtaaacatc 8760 gtctcaacgg ataaagatta ttaccagctatgggatgaag aggatataag agaaagggtt 8820 tatatcaatt ctctttcatg tagtgatgtgaagacacccc gctacggatt tcttaccatt 8880 aaagcacttc ttggagacaa aagcgataacattcccaaat ctctggaaaa aggaaaaggc 8940 gaaaagtatc ttgaaaagaa aggatttgcggaggaagatt acgataagga actattcgag 9000 aataatctga aggtgatcag gtttggagacgaatatcttg gagaaaggga taaaagcttt 9060 atagaaaatt tttctacggg ggatactctgtggaactttt atgaattttt ttactatgac 9120 cctttgcatg aacttttcct cagaaatataagaaagagga gactatgaaa gtactcgcat 9180 ttaccgatgc acctacgttt cccacgggggtgggtcatca gcttcacaac attatcaatt 9240 acgggtttga cgcaaccgat cgctgggttgtggtgcaccc gccccggtcg ccaagggctg 9300 gagagactaa aaacgtcgtt attggaaacactccagtcaa gcttatcaat tctccgcgag 9360 gatatgcgga tgatccggcg tttgtgatgaaggtggtgga agatgaaaag ccggatgtgc 9420 ttgtaatttt taccgatccg tgggcttaccacccctttat gcaacaactt tcttactgga 9480 ttatcgagcg gaatctcccg ctggtatattatcatgtgtg ggataatttt ccggctcctc 9540 tgtacaacat ccccttctgg cacacctgcaatgaagtgat aggaatttcg atgaaatcga 9600 cgatcaacgt gcagcttgcg aaggagtatgtggaggcgta tgaaatcacc atgtatcgcg 9660 atccggaggt attctatctt ccgcatgcggtcgaacccaa tgtattcaaa cgcatggatc 9720 gcaagaaagc acgtgaattt gtgcggggacttgtcggaga taggatgttt gatgacagcg 9780 tgatctggct ttacaacaat cgaaatatttcacgcaagaa tctgatggat accatttatg 9840 cttttctggt atacatgctc aaaaactacaggaaacatca ccttttgatt ataaagtctg 9900 acccggttgt accggtggga acggatattcccgcgtttct tgccgatatt aattcgtttt 9960 tccactaccg ggatattgac cttcgggaacacattgtttt catttccaat gacgaagtat 10020 ttcacaacgg cggattttca agggaggaaatcgcattgct ttataacggc gccgatgtgg 10080 tgctgcagct ttcatctaat gaggggttcgggatcgcttc gcttgaggcg tcgctgtgtg 10140 gagccccggt ggttgctact atgacgggtggtattgcaga tcagtactcc ctctacgaaa 10200 tggattatga ggtggcggat ggaagtgatgaagatataat 
ctgcaagatt tatgaggaag 10260 tgcaccgtca ggtgctcaat cagtatctcgatatgctccg tcaaaacgga aaggatccgg 10320 aaagcgctcc ccgcaaaaat catatgatgcggatggtgaa accttatcgt cattatcagg 10380 gatcgccggc tactccctac attcttgacgacagggttcc tatccgggac gtattcccga 10440 agttcgatga agcgctggcg ctgaggaatcgtgaggatta cgaaaaactt tatgaagaat 10500 cggttgagta catcaccatg cacttcgatgtagaggtgct cggaaaagag ttcaagaaat 10560 cccttagccg tgccattaag aataaccagaaaaccacaag acaggttgtc gtgctatgaa 10620 gaagaaagtg cttcttgttt cgccgcttcgttccgttagc ggctatggaa ccgtaagtcg 10680 cggaatttat cgcattctga agcgaatggaaaaagagggg ttgatcgatt ttgatgtgat 10740 ggtattgcgg tggggtacgt tttcggaaaccacccacctt gatgatgaaa tcaagaagag 10800 aattcaggag aagtatgatc aggtgtacgatgttgcgatc atggtttctt ctccctacga 10860 ctatcgctac tggaacaaca tcttcagagcgaaacacctg ctctttttca atgcgatggt 10920 ggaaacgaaa ccgttccatc cgaatctgttccagcagctt ttcaacttca tgcttcaggt 10980 tcccaccgcg caccttgtgt ttccttcttccgaaatcaag aggatctggg aagaaatcat 11040 caattcccaa cccatccatc cggcaatgggtgctgcagtg ctctcccgca ttcatgtagt 11100 acccaacccg gtagatgaag tttactatacttcgaacttc gggaataaaa acgttcgtaa 11160 aaatgtgatc ggcgcgattc gaaagaagattgaggaaatc cgtcgatcct atgaactgga 11220 gcgggtgttt ctgacttttg cgcctatgggagtagatcga aagaacacca gggttttacc 11280 cgaacttatc gaaatggtgg ggcgggttggaattctggcg ctggcgggcg gaacaaattc 11340 ttttatactt tacgactttc agcggcttatctggatggaa ggtgagaaag cctataagcg 11400 gcttccgctt caccgatcga tcgacgttaccccggaagag cttatgttcg tttttggatc 11460 gctgacggtg gaagagctga gtgcggtgatggatatggtg gatggtggaa tcaacctttc 11520 gcatggagaa tcgtgggatt acctgttgcacaacatgatg ctactgggca aaccctgtct 11580 ttacgtcgac ttcttccgtc gggattatatcccttcggag cttcgtgatg tgctgggggt 11640 ggatttcaat atggtacccc tcccgaaggtggttcccaac attccgcacg atcatccgtt 11700 cttccacccg caaacgatgg tggcggaacccaatttgcag gatgcagcgg aaaagctcga 11760 ctgggtgttg cggaactacg gtgaagtctcaaagatgatt accagccata gagacgcttt 11820 caaaaccgac gatacgatct atgaatttctggttgacgca ctggagtcga tcgaagaacc 11880 acaggcggca taaaaatttc acattctggataaaccgggg gaattcgggc 
atttatcccg 11940 aaaatccccc ttttttgtct caaaaccgttttggcggggt agatatttaa tatcaccccg 12000 tggaaagttt aaccccaaaa caggagtggatatgtcgtac tatactgaag tcggcgcacc 12060 ctactttaca cgtgaagagc agtttgttcggaatttgctg ttcgacgtaa cttttaattc 12120 caaatattct ttcttcgatc tgacgctgcagcgtcgtctt acctttgagg aagtgctgga 12180 agaggtgctg gcggtgtttc atgcccgaatcgaggaagtc tgcaaaccca tttatcgcca 12240 gcaggcgcac cagtacgtgg agaagttcggcgagtatttc cgccagcgca agctttttcc 12300 ctcgatgcgc cttgtgcagt tttcgcgcatggttccttac aaccacaccc gtctttacaa 12360 ttgctcttat actcccgttg attccattgattcgatcgcg gagcttttct acctgatgtt 12420 gtgtggcgtg ggtgtgggat acagcgtggagcgtaaatat atcgaacagc ttcctgttgt 12480 atatcccgaa agtgaggggc agacaatcacctatcaggtg gaggattcga tcgagggatg 12540 gtgctcggcg ctcaagcgtt atctctatgcgcggtttacg cccaaccacc cgaagattgt 12600 atttgactat tctcttttga gaccggagggaagtgtgatt ggaaagcgtt acaatgctgc 12660 atttggttat actaaaaaca atcccatcaaagaagcaatc gaggcggtaa aggggatttt 12720 cgacaaagca gtaggaagga aactcaagccgatcgaggta catgatctca ttacaacgtt 12780 cggcatgatt atcaatcgtg cgaacgtgcgcggaatggcg gcgatcgtct ttttcgatta 12840 tgatgatgaa gaaatgcttc gctgcaaggatttcacgcgc ggcgaagtcc ctcagaaccg 12900 ctggtatgcc aacaactctg tcgtgttgtatagagacggc gataaacttc gcggagtgcg 12960 cggggaaatc gtcgatcttc gggatattttcatggaagcc tattgtggga agtctggtga 13020 acccggcgtc tttgtaacca acgacgaacattatcgcacg aacccgtgtg gtgaagcttc 13080 tctttatcgc aatttctgca accttacggagatcgccatt ccccgtgttc atcagagtga 13140 gatcgcggat gtgttgaaca cagctatcttcattggtgtg cttcagtcta cgtttaccga 13200 ctttaagttc cttcgcgatg tgtggaaagagcgcaccgaa gaagacaact tgcttggcgt 13260 ttcgctgacc ggcatttacg aaaatctggatgcgctcaaa gagtacatga agctttcttc 13320 gaaaggtcat gtcaaattca tggcggctcaatttgccggt tggttcgggt tgaacaaccc 13380 ggctcgcatt acgctggtca agccctccggcacggtgtcg ctgcttgccg gggtttctcc 13440 gggttgccac ccaccctatt ccgaatattttatccggaga aaccgggtgg atatgaatca 13500 catgctggtt gaagttttga aggattatccgtttatcatt gatgatgaag tgtatcccga 13560 taagaaagtg atcgaatttc cgcttcgggcgcaacgccac tttacgcacg atcccatgtt 
13620 tcaggtgcgt cttcgcaacc agatcatgaggggctgggtg gaaccctcgc ataatcgcgg 13680 caaaaacaca cacaacgtat cgattacggtttatgtaaga gatgaagggg aagtggagat 13740 tgtaagtcgc gaactcaaaa atgagcgaaacatttcggga atcacgattc ttccggtggt 13800 tgagaatggc tataaactgg caccattcgaagcaattccc agggaaaagt atgccgacat 13860 gatgggcgaa atccacgtgt accttgatagaatcaaacac cagctaaacg gcacgcccga 13920 ctccccgcgt ctgaaactga tctccgattccgacgttttt gagggagaga aaggttgtgc 13980 cggtctgcaa tgctatttcg acatgtaacatgaaactcgt acttaaacac tccagagaag 14040 agtctttcta tcctgaaaca ataaaaactcttgatcatct tagagagaat gggtgggaaa 14100 tcgttctcct acaggataat cgttttaatatcatagaagg ttacgatttc gatatggtga 14160 ttaccacgtc gaaccctcaa tacagctttgcggatttcca caatgaagca ttgaaatttg 14220 ccaagcacgg ggagtggctt ttttatcttgatttcgatga atatttatgt gataattttt 14280 gtgaaagggt taaaaaatat atcaacagagatgttcattg ttacaacatc gcacgcataa 14340 acattataat tcctcaggag aaaacgggtgatgtgtgcgg gatgtacgga tggcgtagtt 14400 ttaatatcaa tatacctgag gaagggagtgtaaaagcgat aaatttcccc gattaccaga 14460 cgcgtctggt tcgcgccgga accggcaaatggtacgggaa cgcccacgaa cgctttgtgt 14520 gcgataatgc ttttaaacac aaaacgttaccgtttgatgg tggatatatt atccaccgta 14580 aatcttttga gaaacagatt accgataacgcgctctggtc aacctataca ccgtgatata 14640 tgttcagcgt aattctcata cacggaaacgaggatcttat caataaagaa ctgatagata 14700 atcttaatga attcagggaa gcaggatgtgaactcatttt gctgcaggat gatcgttttt 14760 caccgcccga ctttttcaaa tttgatattgttataaaaca ttccgtttcc gaagggatgg 14820 accgtcatcg aaattttgcc aatcaacatgcttcttttga atgggtgttg tggttggatt 14880 ttgacgaata tctattcccc ggatttacagaacgagctcc tgaatacatg aaaagggata 14940 tatgggggta tggattttac agattgaacatgatcgttcc acctgaaaaa acttcatggt 15000 tcgttcagaa ttatggctgg tatgaaatggttgggtgggt ttcaaccata tcgatcaggg 15060 gggtttctta tcaggctata aattacccggaggttcatta tcgttttgtt cgaagagatt 15120 gcggcaagtg ggttggtaaa agacatgaatactggtattc aggtgatttt cgtaaaaaag 15180 ccatatttcc ggcggatcga gaaacacttttccacgttaa acccattgac aaagcaataa 15240 gagacaacta taaatggagg gcactatgatgaaccccgaa atgaaagaga ttctgaagaa 15300 
gcttatgaaa cccttccacc ctgatcgccattcctatcgc gttaccggaa ccttccggac 15360 tcgggaaggg cggaacatgg gggtggtggcattttacatt tcatcacgcg acgtgatgga 15420 tcggttggat gcggtggtgg gaccagagaactggcgagac gaatatgaag tgccggctcc 15480 gggggtgatg aagtgtgtgc tttatttgcgtataggtggg gagtgggttg gaaagagtga 15540 tgtggggacc ggcaacatag aaaaccctgaaagtggatgg aaaggcgccg cttctgacgc 15600 cttgaagcga gcggcggtca agtggggaatcgggcgttat ctctatgcac ttcccaaatg 15660 ctatgtggag gtggatgata gaaagcgtattgttaatgaa gaggcggtca agtcttttct 15720 ccataagcat gttaccgaac tgctgaagaattatcagtaa cccaaaccta aacccgaaaa 15780 atatatggaa acgattgtaa tttcccaaaacaatacgacg gagatgacgg aaccccccca 15840 gaacatttcc gattcggtta aaagcgggtttatctatctt atcgaaaagt ctcatttcct 15900 tgaaaagaaa aacttcctta aaatcatatcgaacatggac ccccgccgca tttccaatcc 15960 ggaggtgcgc gtggtggcgg agtacatatatgattatttc aaaagtcata gtaatttccc 16020 ttctaaaaga aatctttgcc atcactttgagtggagcgaa gatctggaag gagaccccgc 16080 cgattatcag cgtatcattc agtatctcaaatcttcttac attcgatcct ctataacaaa 16140 aacgctttca tatcttgaga aggatgacctttccgcgttg aaagaaattg tcagagccat 16200 tcgggtggtg gaggatagtg gggtgtcgctggtggaggaa ttcgatcttg caaccagcga 16260 gtttaatgaa ctttttgtta aagaagaacgcattcccacc ccctgggaga gtgtaaacaa 16320 aaatatggcg ggcggtcttg gtcggggagagcttggaatc gttatgcttc cttcggggtg 16380 gggtaagtca tggttccttg tttcacttggtcttcatgcc tttcgaacgg gtaagcgcgt 16440 gatttatttc actctggagc ttgaccaaaaatatgtgatg aagcggtttt taaagatgtt 16500 tgcaccttat tgcaaaggac gcgcttcttcctatcgcgac gtttatcaaa taatgaaaga 16560 gcttatgttt tctcaggata atcttttgaagattgttttc tgtaatgcga tggaagatat 16620 tgagcactat attgcgctgt ataaccccgacgttgtgctg attgactatg ccgatcttat 16680 ttatgatgtg gaaaccgaca aagagaaaaattatctgctt ttgcaaaaaa tttataggaa 16740 acttcgtctc attgcaaagg tatataatacagcagtatgg agcgcctctc agcttaatcg 16800 cggttccctt tcaaagcaag ccgacgtcgatttcattgag aaatacattg ccgattcatt 16860 tgcaaaagtt gttgaaatcg acttcgggatggcgtttatt ccggatagcg agaactcaac 16920 ccccgatatt cacgtcggat tcggtaaaatcttcaaaaac cgtatgggtg cggtaagaaa 16980 gctggaatat 
acaattaact ttgaaaactatacggtagac gttgctgtta aatgacacaa 17040 gttaagacaa aagggcttaa agacatcagaataggtagaa aggagggtaa gttcacacat 17100 gtaaatacaa caaagaaagg aaagaataagaaatatttca gggcggaaca tgaacgcctg 17160 tttctcaacc ttattcgagc acttcaggttggggattatg ccgaaatcaa ttctcttttt 17220 cctcttgtcg aaaagcaact ccgatggatggtacgaaaga tagtgaaccg actcaatctc 17280 acttcacttg tttcatatta tgaccacggcgaatgggagc atgatattgt aagttatgtg 17340 ttctccaaac tcgataacta ttctcccgaaaagggaaggg tgttcagtta tatcagtgtt 17400 atcatagtca attatgctat caatttgaacaataaaattt attataaccg ggtggggtat 17460 cattcagatt tctatgcaga taatcctaccaccgaagact acaagggtct ggatgaaaag 17520 gaagagttga gttatgaaat agacgatcagattaatctga agattgattt tgagcatttc 17580 tgcaatctgt ttttaaatgc ttccgaagaaactttactca agcattttca ggaagacgaa 17640 gtttttattg ttaaaaatat tgcgctttctctgaaatatg atccggatat tatcacgacg 17700 ccttttctgg gggttgtaca tcggatgatctgtgagtttt gtggggtgga attttcccgc 17760 tataagtttt ccaaagtgtt caagaaaatggttcaactat accacgaagt ttttaacggg 17820 gggtaaaggt tatttaaata aaaaatatgttttcggcttc tgattataaa ggaaacgtaa 17880 cttttagttt tcacttccct tcgcttctcaccaatgccgg atcgcaccca aataaggcat 17940 atgtgtatta cgactatatg ggtagtgatctggtgttcac tttttctcga ataagattca 18000 gcctgtcggc acccggcacc tacgatgcttattttgacgc tcatattcag gatgttgaca 18060 ccattacctt cgattcaaac ggataccgtgagctttattt cattttcagc gtttcctggg 18120 aaggatccaa cacttcgggc accatttcgggtgccaatct tatcagcgta tcttcctttg 18180 ttactggata ccccgaaaac agttttcttgcctatacgct ttccgtttac tctgcttccg 18240 ccacaaccta tcttaacctt aatgatgcttacagaattta cgtagggaac attttcggca 18300 ccccgcaatg ggaagttggt tttaccggtagtttcacggt ttctgctacg ccttcaattt 18360 ctcacaaccg tttcaggatt ttacttctttctaactttga tagtgcactt aattactata 18420 ttactacgtt cagcgcacca gcattcgcctcacattcatt tcaggttatc aggaaaatat 18480 atgaagttga gccactttct gcttacacagtaccgtctat cgtgtttttc tacacggttt 18540 cagctactaa cagcttcggg tggagctattccaatataga aatggggtct ctttacagaa 18600 tatcaactat gtccattcta agttatccttacccctacac ggcaccggct ataacgtata 18660 tcactttttc tggcggaatt 
gtttcggatgaagaatttat tgtaaaggtg cccataaccc 18720 tttcttatat taacaacata ataccgtatttcatcggcaa ccccactacc acttcaaaca 18780 ttgacgatgt gaatgctact gaagataaaattatccctac ttcgataagt aactttaaaa 18840 caaccctttc atttcaggtt tttgcttttccgaacacact ccctgttaaa acggaacaag 18900 tatcaattcc cgttaccttc agtccggaaacgggcaacat ttctattcct gtttccatct 18960 catttcctgc gtttgtaaga actgctgcggctacaatgga taatccgggc aatttttcca 19020 cttctgtcgg aaatggtatc gtggttagcgatcttgtgtg tcagaataca gggaatatac 19080 ctattacatt tagtggtgtc agtcttgcaatagacgatgg taactggtat gtggacaccc 19140 cctccgtggg atatggtttt aacccgaacagcgggttttg gttcgatgtt cactttatgc 19200 cttatgggga tgtaaactac agtcaatccatttattttac gttttcgttc aattatccaa 19260 caaattatgg aaatatattg tcaggtagttttgttgaatc catttctttc catgcggttg 19320 ctacaggaac cgccccttcc ggtcaggtgggtattacggt gtccaactgg aatgtggaca 19380 accctaacac cgttatggtt ggtaaatatgttaccggttc cttcagcatc acggcaagtg 19440 ctacaaacaa tcagatcgct caggttaccctgacttcatc aacccccaat ctgtatttca 19500 cgacggtttc aggtgttggt attaacaatcttcatgctac ggcggtaaat tctctggcgc 19560 tacaggttgc tcccggagct tctctttctgtttataccca gtggtatatg aatatggttt 19620 atacggcttc ggctcctgat gtaaccatatcggtaacgtc ttctaatgct acggaaatga 19680 acggcgtgcc gggattgacg gaagttaagcgatcgcattc gctgacgaac cctgctcgat 19740 atgcaaattt gaatatagga attttttcactcagtgctta tggtcccttc tatcaatcaa 19800 ccgcctctat tttgccgttc ccttattcttttagtcttgg gggcatcaac gtcgttagaa 19860 atgttggttt ggcttggctt gatttttatccaacgaacag cactcattct gaaatgtatg 19920 ttaaattgac catgtctctg acaggatcggctttaaatgt tcatagcgta gtaacttcat 19980 cgtatttttc tgatccttct aatttcgagtgggaagtcaa cactttgcag catactctgt 20040 tcagcccccc ttatggatat tttcttcatattagaataag accgactcca agtgatatta 20100 acataatacc gacttcaagt gcatatggatatggtacgtt tgttgtaagt tggagcatga 20160 gtcttatttc ccatataaat ggggtaagcgtggcttctct tggacagggg tattcaaatg 20220 ctttgagttt gtggtttgat catactgttttctatgaagc accatagtaa tttcttatct 20280 atacgacaca tacttgataa aattgccgctttctcccatt tcaaaatatt ttctgagcgt 20340 agaaggagta aaatccgtgg 
cgtctccaagctttcgagtg ggggtgatca gtgttgcgtt 20400 gattttgaca tagtggcttt tgatcattttgttgtggggg aaaagcaggt tgtaaagcgc 20460 catctggttt acgtcgttca aaagatgttcatgccagaga atatcgtaag tgcgggtaag 20520 cagcgcaaga agattggtgt agtagttccgttcttcctta atggtgtaaa gcggcgtgca 20580 ggaaatgatg atcgcttgaa tttcctcacccggctttaat tctttaagtt taataaggtt 20640 ttcaatggta aacccaagcg gaatcacttctcttacccca ccgtctacat aggtgttgtc 20700 tccgatttta accggaggaa agaccagcggaatgctacaa gaagcaagaa tggatttgag 20760 aagaagttcc tctttttgct cttccgggatttcctgatct tcaaaaaggt agttaccgtc 20820 ttttacaacg attccggtgg atttgccgttttgcaaattc acagaacaat tgatatagat 20880 tttattgaaa ttcagaagcg ggagcacgtttttctcaagg tatttcccaa gaggggaaaa 20940 atcatacaga taatttcgtt tgagaataagtgttttgaga agggcaaacc actcaggctg 21000 ctgtttgtaa acctgtttcg gggaaagagaaagccacatt tgcttcatga gatcggtacc 21060 tttcggggta agcgccgcgc gggaagcacaccacacgccg ttgatacttc ccaccgaagt 21120 tccggctaca gcaagaattt cgttgtctttaagcgctcct tccctcacca gacaggaaat 21180 gacgcccgcc tgaaaagcac ctttggctcctccccccgac aggatcagca gttttttcat 21240 ttttaattaa ataatgctca ttttcccgatggaagcatgg aaatccactt caatttggca 21300 aatccgtctt ccgttttccc gttgatcatatatgcgtagg ctccaaagac gtgagctatc 21360 ttgcaatact cctcttcgtt gtcaataaagattgtatagt ggttgggtgg aatgattcca 21420 tagatgagtt cgtttacttt cccgatttttttgcccccga caatgcggtt gggaagtgga 21480 agattatgtt ttttcaaata cgactcgatgttttcccggt ggtttgcgct caggatgtaa 21540 aggcggtgat agttcggatt tctttttaccagatcgtaaa gataggtgta caggttatga 21600 aatttcgtaa tcgttccgtc gaagtctatacacaccgcca ccttgattgg tttgacaaga 21660 atccgggaga gcaatatatg agcggatttgtgcatagtca tagacacctg atccggtgaa 21720 agatcgataa tgcggggaaa tttgtaaatgcggcggagac ggttggtaag gtagcggatg 21780 tatctatcca ttcccatgta cttctcgataatatcaaggt attccggatt tttccttaca 21840 aacacctctt tcatcaggtg tttaatatgaatggtttccc gtcgggtgag aagaagtttt 21900 gttaaacctc tcacccgcaa ctcttcgagaatctccggag ataaatcttc gaactggaga 21960 taaagcgttt cgtcaatggt ctgcatattcatagtttact caggtatttt tttgaacatt 22020 gtattaatgg tgtcgatttt 
cttgatgtaatcaacgaatt tgacaccaag ttttcctgta 22080 acatttccga taagaatatt ggaagcgttcaatgccagtg cgggagtcag atttgaaagt 22140 cttgcaattt caaacagcgt cgggagaatatggtttttat tttcgatttc ttccatcaaa 22200 atggcgtcta cggcgttgta ttccaccaactttttatccg ggtagacagg aatctcatga 22260 tagaatctta cgtcgaaatc caccttaccttctcctattt cctctcgcgc aatatagtcg 22320 agccggtagg actccaactc tttgtatgccacaaaggagc gataaagccg catgtaatca 22380 aaaaacacaa attctacagg ggtacggggattgaaataga atggtaggtt tcgatcggaa 22440 attttccgca ccagcttcca gtccggaagcaacttatcac taatgacatt cacctcatgg 22500 atatgactac gaatgagcag gtagggataatcgaactgat aaccgttcca tgcgagcatg 22560 aaagtaaatt ttggtttcag cacattccagaaatactcga gcaatctttt ttccgaaagg 22620 aatgttctgt aatgaatttc aaatgtgttatcccctacgc tggtggtaaa tttgttaaag 22680 ttatcgatat gagcctccgg gttggtgataaggagaagca ctaccaccac cggttttcca 22740 tacggtttga tggaaatgga ataaactgggtctctccacg ggtcgggaaa gctttttttc 22800 ggggaaatcg tctcaatatc gataaagacgcactgagaca aagcttccgg cgtgatgtgg 22860 cttttttgtt ctctgatgta ataggatatagcctcagcct caatcttccc tcgattctgc 22920 tgagcgatgc gcttcaaatg tgggggtacgggtgattgaa ataagtgttt tttcccctcg 22980 attagctcca ctccgtaaat tttcatcgatcgggggtata cgcttgcgct tagcgtgatc 23040 ttcataattc tccttcaggt cttcttcgaggaaatcgttt aacgattgaa gcaactgata 23100 ataagcttcg cgggtttcga gcatgtcgaatacttgcctg tgaaaaaaca gaaaatctct 23160 tatcttgcgc gtggctccga tcagaagacggtgtttccgc tggaggatgt tataccttat 23220 gatataagta atcagcaccc cacttacggttgccgcaata gcaaccccca accagaaata 23280 tacttcctgc atggtttctt tttttcttcaaaaaaacctt tccgtgaaaa aatagtttca 23340 actggtaact gcaaacaaac ataaggagagagtcatgctc gacttttatc gctgctttgt 23400 caaaatcttt cagaatagct acttcgccaacccaacaaaa taccggtttg gcgaaaaggt 23460 cagagaagca gtgttcaact ggggagcacgcgtggcacac cacgacatca attcgcgaga 23520 aaccgaaatc gttgcagatc cggagatggatgattatttc agaagatcat ttttctccga 23580 aaacccctat atgcttgtta aaattacccatcccgatgaa tcgatgataa atacggtaat 23640 atggcaaagc aagcgatatg aaaacttttcccgcgtctat caactcattc gcacaattgc 23700 acagatgaga gaagaagaag 
tcgataactacatgaatcag atcatgccgt ttattgcgtt 23760 gaatctcaat acgatcaatc gctatatgaacaaaacaaat cttctctttc aaacccctta 23820 tgatgagtta tacggtttca ctctgcttttcaagtcggta attcgcattg ccgaagaaga 23880 aaacgaactg gagtatcttg cgaataaagatgtcattgat agttataata agaagattga 23940 ggaatttttc aataccgatg aaaatatcgctacatttgga tatgttctaa aagatatgct 24000 gtctcactgc attattgcca tcggtatgatcctgctggaa gcgaaggata aaacacacat 24060 gaagttttat gaggaacttg gtgagtttatggcggaaata ggtaaggtat acttaaaagt 24120 gatagaggaa ggtgagaaag atatgaatgcgctgacgcat ttatacctct ggtgtatgat 24180 tgccggttgt atcattaaca tgttgaacgtcaggattccg gatgaattgc ggttggctgc 24240 tatcatggtt gaagaaacgc ttgcctcgcaccaactgcaa ccctttattt cgttaaactg 24300 aagaggggta tgatacagaa aacaaccccgtataaaaact acaaaaagta catggatcag 24360 cggggagaag tgctgagacc gcacccccgcaagaaggtat atatcccatt tcttattgcg 24420 gaatgtggaa cttatctatg gaacgacataagaaacatga tgtttgcgct tccggggtgg 24480 aaagatgtgg tgaaaaaata cggtgtgggggaaaaatcca ccccggagcc tttctatgat 24540 ttcctttcgc tttttatcaa gaatactacgctttacagtg attatagaac caaacaaacg 24600 ctttttcaat cgcgaataga gcgcataaaaatggaagagg aagtctggaa tctttccaat 24660 gcactgatca atctgttctt ttatctgaaagagcattatc cctattattt ctcaaaagag 24720 tttgtctttt actttgacat taatttctatttcaggaagc tcacatttta tgatattctt 24780 gccggggaag atttgcggaa taaaatcaacgacacatttc agaaaatgct ctctaaaggt 24840 tacacggtac acctttcaaa aatgaaacctcagagtagag aagattatct atgtttgcgt 24900 tatgccgaat atatggaagc tattatggctcgagatgagt tcaagcagga aatggatatg 24960 aaagggagtg ggaatctttt ttatcttattgatggtttta aatgggggtt gataaataga 25020 aaagatgaag tagaatttgt tgtactggtaaggtaaaaac tatataaata aaaggggtta 25080 gtttatggcg agctggactt acgataccacttcgcgtatt ctgtcaatta ccgttagtgt 25140 ggtggatctc gacaataacg atgtactggtttacaccggt agcaattatc ctacatggtt 25200 gagtccgccg accacttcgt acgtttccggttcgttgtct ccaaagcagt ttgatgtgta 25260 tatcagcggt agcacgctca acgttcagacagggtcttat caggttgatt tgcttgccat 25320 tgaacagggt gtgtcgttcc cgctcacctcttcggcaagc ttcacgatta cggttacggc 25380 ggtttaacaa attttaggca 
agaagtctccatcctctaca gggtggagat gaattgccta 25440 ttgacaaaat tcagtggtgt attacaataaaagcaagatg tttagagcat acaaatacag 25500 gatatatcct aacaaaaaac aaaaagaacccttagagaaa acttttggtt gtgtggggtt 25560 ctactggaac agggcattag aaatcaaactcaaagcttta ggaaataaag agaaaatacc 25620 acaggtcttg cccgccttaa gggtggtagggtcggaacga cccgaactta tgcctgtgga 25680 ggagcgggta gctccgatga agcaggaagctccatcttct acaagatgga gtagttcact 25740 tcacagaaac tttatttctg ttttatcgttttttccgtaa aaaaaaagaa attatggttg 25800 taaaactacc gctgcatgat ttttaccctgaaggttcacc tttcaaaacc gaaaacttta 25860 cggtaaaaga ccccaccatt gaagacgaagaccgcctttt caacccggat cgcatcaagg 25920 ggggatatgc tctggatgat tttgtgagaggactccttcc cgaagaggct cagcgccagt 25980 acggaaacat gttcctcatt gacaggaatttcattctgta tgccgtcagg gtggcaatgt 26040 tcggagacac cattgaattt cgggaaaacatcgaatgttc tcattgcggc gcttcgcttc 26100 gggaggctac catagacagc gaggtttttattcccgaaaa tcgtaagttt gagttaaaag 26160 aagggggtta ttttatccgt tttaagttgcttaccgtttc agatcagaat gttatgagaa 26220 aagatccact catgaaaagc aactttctgacgcgcacgct ttattacgta atcgatacga 26280 ttgaaaaaga agagagcgac attaccgacaaatatgcgct tatccgttct attcctattt 26340 cacttggcac caagatcaga gagtttctgaatacacaata tcctcgattt gatattttca 26400 tcaaatgcgg ttcgtgcgaa agcaccatcccctttgagat gaacgaatcc tttttttgga 26460 ataagttatg attcagaaga agagcttgaaaaaatcgtgg tagaacggta tgaagcccga 26520 aggaaattgc ttctctttct gaaagaactggatacctatt ccagtttaaa aacgaaaatt 26580 tctatatcag aactccgggt aattgcctatatgtataccc agcaactgga agagcaggaa 26640 agagagttca agcgttttcg gggaccgcactgaagtcaag cgtggcgtag tcaaattgaa 26700 gggtaagctg cacgttcaca agacccgaagcatcggagaa gtcgagcgag tcgccgttga 26760 tgtcggcaac ccaggctccg tggaaagtccattgttcaat tacggcaccc tgaggatcaa 26820 gaagcagaag ctggatattt ttcttgtaaacatcctgata gccgtcgcgc ccggtggtag 26880 gatcgtggtg tgcaagtacc cactggtaaaccgccatcat ccccgattcc tcgattggat 26940 cataaagcgt caggttgatt gggttccagctaatttttcc cttatatttg aagtaggtgt 27000 taatgtggtg cacttcgccg acggcaaagctgaaattagg acgcgccgaa gcgtagacca 27060 tgtaggcggg aatcccgtcg 
atctgcatgaggaaaaggcg tttctgcttg ggttcaaaac 27120 gccggaaaag catgttttca acaacgcgtgccatatcgtt cctttttctt taaatatgta 27180 taaatcgttt ttcaaaaaaa tgacagggaaaaatatttaa agttgacaat taacaacaaa 27240 accggaaaaa atatgtatag ggtaaacgtaaaagaagtag acctttcgat tacccctgaa 27300 gtcgggacac cggtccaaac ggcgcttgtaggtgcgttcg atctaccgat tcccagcgaa 27360 cttccggtat cggtaacccc cgatgaattccgccgcgtcg gatcaaccga actcagtctc 27420 attgcagatt cgctggtggg tggtcaggaggttacggtga tcagaccgcg aggagaaacg 27480 caatcgctga atgcggcatt tgttgtggtgggtggttata atgtaaccct tggtgccttc 27540 aacgttttct atctgatgtt tctggggtatgatcctcaga aaggatatac tgatgtgtct 27600 tatgtagatg tgcaattggc tggtaccccaacggatacca ttctgttcag ctactcgctg 27660 gacggttctt cgacaacgca ttcacttaccataaatctaa acgcccccag tgttacgcta 27720 ccttctaata tcgtaccgct ctttttctactatgaacctt atacgggttc gattacgctc 27780 cagagttccg ttaactatag tggattaacactgaattata cggtcagcaa agcgaccact 27840 ccttgggtgt attttgctga atatggcacgccaacatctt ctcttacgct ttataaagga 27900 ttttatctgg aaggaattga cctgaacagctttaacaaac aatttgttgt atctatcgaa 27960 aatattacgg taaatagaga aaaaggtcaggtgctttatc cttcgtttga tgtggtggta 28020 cacttccggg atattagggg ggtcagtgccaataccgaat atattcgctt ccgtcaggtc 28080 aatctcaacc ctgaatctcc gaattatatcgagcgcgtaa ttggcaacat gacctttgag 28140 tttgacggtg agcgcattgt tacaggcggtgaatacccca atcaggtacc cttcctccgc 28200 gtggtggtct ctcaggatat taagcaaaacgtcgccgggg ttgaaaagtg ggttccggtt 28260 ggatttgaag gtatttattc tgtaggcgacttcactgtta ttgttaacga attgaccaat 28320 gtgtcaatcc cggttacgga ttcggctattattccgccca tgcggtttac ccgcattgaa 28380 cagattacgc tgtcgggcgg tgcttcgttcagcgtgatca gcaatcaacc gtatggtttc 28440 aatattcagg attctcgtca tagctactggctctcacctt tcaaagatga tgaactgata 28500 atcggaaccg aactggtact tccggctctggatgtttcaa cggaattcgg agtttcaagt 28560 tgggaagaag cacttcctga attcagcttcctgatgccgt tccagggcgg ttcagacgga 28620 tacattcgcg ttgatgaaaa tgagccggatacaatcgggc gcgtgaagat cactccggca 28680 ttgcttgcca actatgaaag gttgcttccgcttctgacgg aagatcaatt cgatctggtg 28740 ctcacgccct atctgacgtt 
tgctgatcatgccggaacgg tgaatgcttt catcaatcgc 28800 gccgaaaaca ggttcctata tctgtttgacattgccggag atgatgatac cgaaaatctg 28860 gctatttcgc ttgctggata tatcaactccagcttcgcaa ctacgttctt tccgtgggtg 28920 cgtcgtctga ccaataaggg aatgcgtacggttccggctt ctcttgcagc ctaccggagc 28980 attcgcacca ccgatccgga gacgggtctggctccggtgg gagcgcggcg cggcgtggta 29040 acgggcgagc cggtgcgtca ggtggattgggaagacctgt acaacaaccg aatcaacccg 29100 atcgttcgcg tcggaaacga tgtgcttctcttcggtcaga agacgatgct caatgtcaat 29160 tcggcgctca atcgaatcaa cgtgcgtcgactcctgattg ttatgcgcaa tcggatttct 29220 cagattcttt ccagctacct gtttgagaacaacaccagtg aaaaccggct tcgtgccgaa 29280 gcgctggtgc gccagtattt ggaatcactccgtctccggg gcgctgtaac cgactatgag 29340 gtggcgatcg attcggttac cacaccgacggatatcgaca acaacacgct ccgcgcacgg 29400 gttacggtgc agcccgcccg ctcgatcgaatacatcgata ttacctttgt tatcacgccg 29460 acaggcgtag aaatcacctg agaaataaacctttcaaaat ataaacccgc ctatcaaaag 29520 gggcgggttt ttttatttaa aataaaatgaagtttaacaa ctgggttgag tataccgacg 29580 acgtactccg acttgagtat taccttgagtacgaaattcg ccggtggaga tatcagtatt 29640 gtgatccgtt ccccactttt gaagatttcaaagaggcggt caaaaaagcc cctcgaatta 29700 tcgtaacgcc ggaacttgat aaaattataagaaatcgttc tcgaacccgc acgtttgacg 29760 aactgcttgc attgattaaa acttaccggggatatccgaa atttcgcaat gaaaagacgc 29820 ttcaggctat atatgacggg tttaaaaacaataaacccat gaaaatgccg atcgtgttgg 29880 agcttcccga cggaacatta cgggttatgtctggaaatac ccgtatggat gtggcattcc 29940 agctcgggat aaaccccaaa gttattctggtgaaggttcc tgataggtgc cattaatcca 30000 cactttccat atcaccatac tgatctacaatgtaaatctt gttgcagaat tctttaaatt 30060 tattcagcgg aaccactttg gggttggtgatagcccattc gtttacgata aatgcgtgga 30120 gacgcgatcc catttctttc aaatccctctggatttgttc aacttcatca tcccattccc 30180 catcacttat cttatggatt cctccacttggtacgggttt gccaagatag tgaaaaatga 30240 aagggtggga ggagtcgttt ttaagccgctgcaggatttt ctgacctgct ttcgaggaaa 30300 cgctgaacat tttcttttaa ataagattcataatcttcaa ttagcggaaa gtgttcaagc 30360 tgtttgagca gggtgttaac ttcataggcaaaacgaaagc gggagttctg gtagaagtct 30420 ccgattgtta ccggaattct 
gaaatcaggaatgtttttga tttcattgtc ttcaatattg 30480 aagacgaaat agcagtgaat gagcgggttttgaagctctt gcttgataac ataacgatcg 30540 aacattttga tatatttcca gatcacttcccggttgttca aatagatttc ttttgcccga 30600 ttcgggagaa actttgttat gaagaaatcgtcaagcagct cttcgatctc acttagtgca 30660 ctggtctgct tgttaataaa gcttttaattactccgttga aatccccaag cacacattcg 30720 gagctatcgt gcatgagcac acaatagccgaaaagcgcat cgctacacac ttcgcgagcc 30780 acgtcgtaaa caatcagact atgttcgagcacggaataaa aatattttcc tccgtttccc 30840 tgatagcggc aaatgttgga gagccttgcagcaacatctt caatggtaat acggtgaagg 30900 ctcgggtgca attcgagttt catggtgttatgactgtttg gttcacacga cacaatcctt 30960 aaagaggata aggttaaaag aggttcccttccttcaatta aaattcaaaa atgtcaatat 31020 caatgtcaag atcagtgtcg tcttcggttttacgttttcg aagaacataa tcgacgtaat 31080 ctatatggac cacaaagtag gaggtatcgtaaatctgaac caaaatcgca ttgacaaacc 31140 gttcaacata ctttttggag gaaaagaaagcatttatcaa aatatctaac atcatcttat 31200 aatacctatc gtctgtgttc attacatttttaacaagtac atctttaatt tcatctaata 31260 tagatttgtt ttgaggtata gtttttattgcttgatctgt gatagctttc aataattcat 31320 cgtcattaag tattgttatt acagttttccataaatttgt aatcatattg ctgttatttg 31380 aagagaaaaa gttgcgtggg tcacttaaatgcaaaactac acttggtata aataaacggt 31440 aagtattact tacattatct tttaatccattatcttttaa tcccagaaga gatttatagg 31500 tatcgataat tatagaaatt ttgtcaacatccattataat atctttgtat ttaagtatat 31560 tttctctggt ggcgtcatta aaatctcttgtcggggtacc aaacagcgtt ttaataatct 31620 catctttcag tttaggtata atctcatttaaaatttcctc tctttttgat tcgtattgtt 31680 ttactttgtt ttctataact aaagacgcgagaaatgtaga aaaaaggtca aatactgttt 31740 ttttggtctt ttcattatta ggtctcaccagatcttgata tttataatct ataaaaaatt 31800 tcaaattgtt ttctattttt tttctgtttatgttaataaa tttatctctg agcgttttat 31860 atacgacttc gttttgaaga tggaaatctcccacaagatc tatggctgac gaactttctg 31920 gtggtatgaa ttgaagtttg acgttcttcacaaggggaga atcctcttca tagttggctt 31980 ctgcaattct aaacgtttcg gagctaattatatctgaaat aacttccagt tgggactctc 32040 tgataaagaa ggaaataaaa tcttttattttatctttcaa ttgattccag agatcgtaac 32100 cgggtacgta ttcctgaaaa 
tctgtattcaaccatatatt aagtactttt cccacacttt 32160 ctattccctc ctttatttcc ttactattaacagataaacc aatgactttg ttaataacat 32220 tttgtgcaat agttttaact ataccctcaatgactttttc atccaaactc tgagagggtt 32280 gaagaatata gatgttagat acaaggttctggagtaatag tagggcttcg ggggaaacat 32340 ttttcatgaa tttatctaaa gtggagtaaagctcttcgag ttctttcttt gctttatcac 32400 tgagaccgat tatttccgct attttaaagttaatctcttt aataatgggc aaaggtagcg 32460 aaagtgtttc gagattttga ttgaactggtttttatactc tctgatatcg gttgatttgt 32520 aagtaatgac atgggcaatg acgccgctggtttcaattgt tccggtaaat gtggatactt 32580 ttattttatg gtaaaagtca tttctcgggtgtataaagag aataaaaaca taatcatgtt 32640 tataattatc ccaatagcta tcgttttgagcaatacagac ggtggtactt ttgttggtgt 32700 tggtttcggt taacatttct ttgattccttgataggatat ttcaggtaaa agtcgaacaa 32760 gtacggcatc gctatcttcc atttccggtttattatgata gacaagttct atgtccccgt 32820 tcttgatata ttttctgacg gcttccatggtgttgttcca gcgatccagg tagcgaaccg 32880 tataagaggg gaggtatgta tcgatgatctgctcgagatc gataaagctt ttaataaact 32940 tgaacttcat agaatgaagt tcgtcttttcttccctcctg actcaatttt ttattgataa 33000 caaaaagagc cgccagttta tctacattgctgagcatgtt atgataataa aaacgattct 33060 ccaccggaat atgactttca tccgatcgatttgtatacat tgctacaata ccctgtagaa 33120 taaaaacctg agcctgctcc ggaagaggtgtgttgtaggt ggtttttgcc gattgcataa 33180 ttcgatcgac aagttctttt ttgaccttatcttttacaaa actgaaaaac atacttttaa 33240 gctcttgctc tgtagccgtt tccggatcaatttctatatt gagtgtgggg tcttcgttta 33300 ttttttgtgc cagcttacgt gcaaaattaatatcgaattg catatttgta gacttttatt 33360 ttaaataact tttcgttttc gggtataaaaaggtctggtt ttgctggtgg attcctccac 33420 ctgaatgttc agcgagaagt tcggatcacgcggaaattcc tgatagtttt ccatatgcat 33480 taaaattttc aggtgatagt ttatttccgccacaaatacc agttcatcag acgacgggtt 33540 gatcatacga tcggaaattc cttctacgacaatatcccat accgccgcat tgtctttggt 33600 aaggataaga tcgggtctta cgtttgagagaatctgagtg atctcgcttt cttttgtaag 33660 ataataaaaa gcacgatagt tgactttatagggaaccggc accctgtact gaatggtgga 33720 ttggtgttga ttttccgtaa aggtaagaaaagcaggaaaa ttctgtataa cttcgattcc 33780 ttctcgcatg acaacaacga 
agggatactccactttgaac atatccgtta ccatcgattt 33840 cctttgcgcc tgagatttgt cgaaaataatgcgcggtttg gtgccaagag ccttttgata 33900 gatttctttt gcaaaaacta cggcaaagtaatcggctgta ataatttcgt tcattcttct 33960 tcagggaatg gaagttcttc ttcaccaccacccgtttctt cttcgaattc ttcgaaggct 34020 ccgccaagat taagttcgcc gcccagttcttctccgcctt ccgttccgaa atcgaattcc 34080 gttcttcctc ttggcgattc gatcggggagccgcgctcgc caaggaagtc ggcgggggtt 34140 gtttcctcac cgaatccgcc cgtgtcgaaaagaccgccgc caccggctgc ttccgccact 34200 tcctcctggg gcttgagatc gtagggaatctgaagaatgt tactataaat ccagtcttca 34260 cgaacccagc ctttgaggcg ttcggcaataccgattcgct gctcaatcac ggcaaagcgc 34320 tcaccttcca caatcgaatt cgagcggttcattaccaggc ggaaatcctg atcggcaaac 34380 tctttgttca tgcgcaccat gcgttcgagttcttccacaa agaacccctg aatgcgtttg 34440 atcgtgttgt tgaatttgat atcctgagtagccagtgtgt ttttagcatt cacgtctcct 34500 tcataaccaa tgaacgcctt tggtaccttgagtgcggaga tgagtcggtt gagcatgtat 34560 tccacatctt cagcaagatc tactttggaaccctgaagaa tatcgatttc caccgcacga 34620 cgatctccgc gccggggaat gaagtaatctttgagaatgc tttcgataga aaagtagtta 34680 tcgattccga gaaattgatt ctgattatttcttacccaat agtctcgctt atactgcatg 34740 gcaatattgg tcagatattc gttgatcttgtcgggcggca cgtttccgac atctacgtaa 34800 aacacccgtc tatcgacact acgaaccacacggtaaagca tgagcgcatc ttccatgagt 34860 cgaagctggt tccatatcgc tcgagcactttcaaggtagc ttctaccata ggggaagaag 34920 ttggtgtcga ttttgtgaga aaagtgaatgacatcttcct caggaatatc ttcgttaaag 34980 tatccgctta caacgttacg gtaaacgtcggtaataacat aataccaggt atccgtttcg 35040 gggttatatc gctttgagaa aatgtaaggagagaccacct gaaatttttc gatcgtgcca 35100 tccgaacctt tttcaagaat atgaagaaacatatctccgt atttgatcat gttgcgaatg 35160 ataggatagg cgttcttttc aatatttataacataatcca gataggagag tattgctttt 35220 gcaagctcaa tgtcttttgt taccacatccacaatattac cgttttcgtt gggaatcgtg 35280 cattcatctg caatgatatc cagcaccgtggaaataagcg gatcggtata atccatgcga 35340 tcgtacatat cgtagaggaa aaaccggttgaattctattc ctccgtagaa cctgctcgca 35400 taccccgctg tcgcaaacgg gtggtacatgttaatcggaa tcatggaaga gccacccgca 35460 ccgtgcggcg ctcccatacc 
atacatcggagaaaggaaat tggtgaagtt gacagcttcg 35520 ttcagttttt tatatttttc cagagacggcatattctcca cttttttgtt aaataacatt 35580 aacctaataa tgtaccaaat aacgaaatggtttcgtttat ttaaaagaaa atgacctatc 35640 gggaagccag agcacttttc aacaagatcaaaacactccc tgattataga aaccgcgttg 35700 tcattcggat gtctgaaatc agagaaagacccaccttcaa ccctcgagga caatataata 35760 ccacaccccc cggcacttat gcctatccacttggcttcgt actggacatc gggggtgggg 35820 gcgaggattt tgtcgatttt attgcgggtattatgctttt gccctacgct tcacatgccg 35880 aatgggtaca tatcttttac ataaaagacatgggttgttt tctgaatctt ggggataaag 35940 aggatacaga ggaattcctg agaaagtatgcagagaaaaa tccttttata aatactttaa 36000 tagagcacat tcgcatttat cagccgataaatgataatac gctctttccc attctaaacc 36060 gctatcttgt cggaatgcct tatgaaaacatatcaagcga agagtttcac cagagtttca 36120 acagggttct ggaaaagctg aaagaaggatacatagacat tttcaaaggt gtttaccagc 36180 atatcacccc agatgacgca cctgctgttgctttcgtgaa cgaattcaga gattttattt 36240 ccaatctggg ggattatcac actggaaaaaatatactgga agtggcaata gcccgaattg 36300 tgttcgccgt tttcagacgt catgaacttatagaaatgat cgaagcaatg atcggtaatg 36360 caccgggaga aattacctcc tcacgctttatcaactatct tccggtttct gattccagaa 36420 gtctgagtgc atttacccga tggtttgccattacacatcg cctgttttac tatgctttca 36480 ataaaggggt aatcagagag caatatcttgaagaatcggc tacgctgttt gtggatatga 36540 ttttcaccat tgccttttca aaggaaaaaataagagctgc tatggataca atgttcagaa 36600 tgttaataga tcaaatcaaa gataaaggtatacccaaatc ctatcgggtt tacagcgaac 36660 ttggttattg cggaatatac gatccgggaaccggcggtgt gcatgaagcc gaacctgctc 36720 aggtggtctg gtgggatccc tccgtggtggaatactacgg ggcgattccc aacataggga 36780 tgcgagaacg taaaattcag aacctgaaggattatataac cgcccttgac gtggtcagat 36840 tttttgtcaa ggtgtttata tacaataaacatttacttac acaagaaccc cgtttgttta 36900 atcaatcggc tgaggatatt gcttggcattttaaaagaat attttataag aaagaattca 36960 tttacctttt tgaaaaaggt ttgcggatgattagtagatt tatcaaaaca ggaaatgtaa 37020 atcagttgat gtctcttatt catgatgtactcatgttgca ccttagaaca gatctcctcg 37080 cgagggtatc tgcagtttat agatcatactctcttgaaga ttattataac gaagaactca 37140 aacatatgaa gagggtggta 
ggtgatattgccgataacat ggttgcactt cttacaaatt 37200 acgccgtgga tattctgacc ggtaaagagcaggttaagga tatagacagc gcattttccc 37260 attatctcga tcatctcaga gaaaaacttcaagaattgtt agataagtct gctttagagt 37320 tgcgcggaaa agcaggtaca aaaacactattgcaaagatc tttagcagta gagtcgggga 37380 tagagtctat tctttcagga attatcttcatgagaaagtt tctggaagct tatgattcgg 37440 atagagagaa gattgaggaa gcgttcagggtggtaaaaga aagactaagg gattaaatac 37500 tggtaattgg gattgtgtgg aatgggtatttttgaaaaga aggtgaatct gaaagagggg 37560 tggatccacc ttacaacatt tccgtagaaagaggcaaaaa ggggagaatg ctatgaagat 37620 caaaaaggta attatagcgc tgctgtttctactcacagcc ttccagcttg gggggattat 37680 ggcattgtat ctttttccgc gataagcgcctgtagctcaa ccggaaagag caccagcctt 37740 ctaagctggt ggttgtgggt tcgagtcccaccgggcgctc aggtgtaatc agaaacaaaa 37800 aaagggaggg agtcatgaca gtcatatgggcaatcttttt tatagtcatg gtgttgatgg 37860 aaattcgaac ctttcgggta aagaggtatctggaagatca ctccacccga caaggttctt 37920 atgcaaccga atggtattac cgggtggtgaatgaaaagga ggaacgtaaa aaaccgggtt 37980 cgcaatggga tttgtaagaa aaaaagagcgccttattatc aagcgtgatt tcgacgcgct 38040 taaatttgaa gacgcgttcg atcttgagatcgtgtttcac gtcaaccccg aagttgaaat 38100 tattgatcgg ggagaagacg tggttgtcgtatatgccccg cttggcattt tgggaagcgg 38160 ggaaacagtt gaagaggcaa tgaatagtttgcttcttcag gctgtaaagg aatataaaga 38220 gagcacttat gaaggagagc gagagatacttcgttccttt ataaagttgt acacgtcgtt 38280 tctcccgccc gactggaaaa gtcgggtttgagtaataggg cattcgtctg ctctcatgag 38340 taaccgataa ccaaacaaac ggaggtagccatgaaagagg tcagcgtcac ccatgtcgtc 38400 gtttgcccct tctgtggcaa gacgggcgaagtcaccatta cggcggatgg gagtggtccc 38460 cgcctcgtgg aaatggagcg catttgcccccacgtagata ctgaatacga cgaaagaaag 38520 cgggggattt acgtacattt cagtgacggcgaaagggggg actacgtctt cctatacgcc 38580 ccccttgcgc tgtacgtccg ggagggcgatccccatctga tcgcccgtgc gctccgccgg 38640 cggggcttta aggtacgggt cgacgggcgccacatcatct tcaagacacc cgtctacccg 38700 tatccggtgg acttggcgct taggcagtatatgcttaacg ccgggcgcac ggtctcatac 38760 aaacacgtgc atctgtgaag attatgtgagggggttgcgc ggcgcttggc attttcgtat 38820 attagaactg tcaccaacca 
aacaaaccaaggaggtagcc acgaaagcga ttgacgttct 38880 caagacattc ccagccccgg acagcttcgagggcgtctat tactgtccgg agcatccaga 38940 ggttgaaatc aaagaaaccg tccgttggacggaggttcca aaccccaacc ccgacgcccg 39000 caacccggtc gcagtacacc gggttgtggaccgctggtgc ccggtctgcg ggagaccggc 39060 tgttctggga gctcgatccg catgacgggtgtttcggcga tattcgcaat cttcgcaacg 39120 ccgaagcaaa gcttgcccgc cacattttaaagtaatttcc gtttatattt acttatattt 39180 acataggggt ttagaacaaa ccggaagatattatgaagtg gtttaaacga cttacgacgc 39240 tggagatttc ccttcttatt cctctctttatttccttgag cgtttacttc tccactcagg 39300 gagtcgccaa atttgtggcg cttcctgtgtgggtggtggc actggtaata gcggctattg 39360 acgtggcaaa gttcgtaagt gtgggtctccttgttaccac aaggggatgg ctgctcaaaa 39420 caattctgat tccggtcatc ctgtgcgccgtctttgccac ttctttcagt ttttatgcgg 39480 cacttgttta ttcacacgcg gagtcggtgtcttcagagaa agttgaaaac atcacagaag 39540 ctaccataac tcgtgaaacc gttcagcgtcagatcgcgcg ttatgagcag cttcttgagg 39600 aggttgaccg ttctattgaa aatatgaacaacacaaccac agagagcatc tggcaagaac 39660 gtctccgcaa gcgagagttg gagtcgctggtgaatcgaaa ggaggagtac cttgccgcta 39720 ttgactctct tgaagccgtt cttgtaagcagcacggtgga atcgaatcag cgtcaaaatc 39780 tatttttcct caactatatt actcccaacttctatttcgt gcttcttacg atcattttcg 39840 atccgcttgc cgttcttctt tacgcgctgtttgtgcgcat gctgaagcaa aatgcgcgtg 39900 aggaagatga aaaagaagtg aaagaggaaaaaacgggagt ggaggttgtg aaacctaatg 39960 aacccgaaga gcaggatttc gtttccaagcaagaggaagc ggagcagctg ctgatggata 40020 aagtttttca aaccaaacgc tttgcatttgatccaacccg aatgcaaccc cagaaggtgg 40080 ttatacggga aaaaaggagg aggtgatatgtacattgtaa aaaaagtcag gatattgagt 40140 gagacggcaa cggtaatcgt cgaatattcagattacaggg caaatgtatg ggttgggaag 40200 ggaatctcct gtagagcctt tctcaaaagcaaagaggtta gaacaggggt aatcccttac 40260 ctgaccattt acaaaagata ccccagaaatggaaagctac tggaagattt cttaaaatcg 40320 atggaacaac aatatgtaca acatacgcgtcaacacatat agtgtagggc tgcattcgca 40380 ccaagttccg attctcaaag cagccaacgatccttccatt gttgatcaca acatgtatct 40440 gtacattacc gcccgccacc cctttttgcggctcaagata gatttcacgt ttaacggcaa 40500 caaaagggtg gcgtcatcgg 
caattatttccatgcacaac agggggaaag atctgattaa 40560 agaatataag ctgttcgatc ttgatatatacaaaccgaca actgcttcgt ataaaccctc 40620 agataagacc aagactgtaa agttgatttataacttttaa atgataatag acgttgggtg 40680 attatgtatt gtcttcgata taaaatagcagatatacgtt gtgccgccct taacgtacat 40740 gcgtcgaaag tggcaccccc ctcctatgtagacatagtga ttaagggggt ttttaagata 40800 aaaaaagggg cgctcagtat cgcggttcatcctgatacac ctgtgggaga cataaggttt 40860 gattgtctca tgaaggttta tggaagcggagatgtgtttg aaatacactg ttttaaaatc 40920 atttttcatt tgaatgatat taaaaagaggtgttatcgga atcttttaaa gttggttata 40980 agttagtgta aatatgtggg ttctaaaacggcaagagcag gaaataggga taaagagtca 41040 ggatacgccg attctggctc ctattaatgctgaggtggaa atacacatag aaaagtatat 41100 aggcggattt ccgaagacaa aaggcttgtatgcagaagtg atctatgcgt caaaatataa 41160 caaaccggtt gtttttgcgc aaaccttaaatgcgaactac gaaatgtact tatgttctat 41220 tggtatttat aaaaacgtcg gaagacagaataacattata aacatcttaa aactttatgt 41280 aaacctgtaa caccatgtac gttttaaaaattaaaaaata cagctttcat accggatttt 41340 acaaaattcc ggcaaatggt atggtacgggatcctgagaa tgggtatatt gatctttgtc 41400 tcaaaacgga actcccgtta tgtgctttctttgtaaacta tgaggaagaa gatgaaccgc 41460 gcgtttttgt tataaaagag gctggaaaagatcctcagga aactatcgta gaatttattg 41520 taagtaaaaa ctttcccatt aataggaatttcaacataat caaactgata tttgcgccat 41580 gatggttgtt gccggtagaa attacaagctggaatccagc gaaatgctga ttcccaacgt 41640 ggtggttaca tccaaaaacc gaatctataacgtttcgata tgggttatag acatcggata 41700 tttctatgcg ggaaatgaac gggggtatcttggattaaga tgtggggttg aaaaaacgtt 41760 tactggcttt aaaattaatg tctataaaaccacaaatcgc gggaagtgat atgtatatga 41820 taagattgaa atgccacgat tatcccaatacggtcaacag caaaaaaatg gttaattaca 41880 aaataactct gaaatcagaa cacccatcaaacacacttac cattctgata aactgggttt 41940 caaccaatat cgaaagatat ggcaaccatattatgtttca gcgtcccggt tattacctga 42000 gcgctacgtt tttgtttaaa aaacatctttatttcaaagg cggttaccat ctacaaagct 42060 ttcgactgta aaaaatgtaa accgatatgtatctcataag acatagtctc aaaaataggg 42120 ttgcctatcc agaggatcct tactacaaaccaccggtttc caccggcggg aaatgggtta 42180 cgcatctggg aaagctttgt 
aaaatagaatttcacgcact ggttttgcag aaggaaatgt 42240 gggaagagat aagaagtagg aacaaatcactattcaacga tcggattcgc aaagtacttt 42300 tgtacgatac tgaagaaaac ctatttgccatatataagat aatctgatgt ttctgcttaa 42360 gacaacaccg cgcaatcaca atccgcgtcaggtatggttg aaactacctg accagagacg 42420 ggtgtttttt gaggtttcct acagattcgtagaaatttca catgcgactg gaaaccgtgt 42480 taacagaatt ctattacaac tcctgtcggaatatcatttt acatttgtaa aaaaggcgga 42540 ctatgctgct ggtcaaaaac agccacatcgatcccaatga tggtgaaatg cggctaaaat 42600 acagccgcgt tatggatgtt aaaatttatcttggggcgtt tgggaaatac ccaaaccccc 42660 gaagggtgtc ttatagtctg gcaccctttgatgaactgtt tgagtttgca agctggatgt 42720 cacttttgat gatagaaaag cacataaaccggaaaaagta atatgtacgt gtttaaagta 42780 agttatttta tgaacggcga gccgataggcatacgtaccc tttcaaggtg ggtgcaggtt 42840 gaaattgcct actggggtaa agaaggtacacgttataaaa gagttaccgg tgggagattt 42900 gaggaaaatg attactggta cgaaatagagataaaaaaat agaatgtgtc attatgtatc 42960 ttatgagaat gaataaaagt gtaccaataacacccatttc cgggaggggg agtacactga 43020 gcgggcatag cgagattaga ataggagccgcgtgttttcg cgccatgcac tggacttata 43080 taataagtgt acacataccg aacaatcagtttagtgtttg tttaatggaa aaaagagaat 43140 taataaacgt atttttagac aagcatataaaatgtacgta ttaagtacag gtgttgatga 43200 tccactattt atgaccggaa cttctacaccgggtgtgatc actcccaaag agggttttta 43260 tacaacccag aagtttattc gtgtgtggttttttgtacgc tactacagtg ttcccccaaa 43320 atcccacaac gttgtacatt ttaccagcgccaaccattat aaacttataa aaaaattcta 43380 ttatgtatac tattaaatta aacaaaggggttaaaaacaa cgaatgtttt gtggttgtcg 43440 gaaacgaaat tctctccaat gaccccattgtaaactataa tatatttagc aaacaggatg 43500 atctgttcgc atttacaatt caatactggcatagcttaag aacactggga ccagaaggca 43560 caccacttga tctggaactg acgtctaatgcgataaatct tggaaggatt tataacgaag 43620 aagatgaacc cttcccggat ttcatttttgaaaaactgat atataaagac tttcaagaaa 43680 gctctaaatt tgggtggtga tgtatataacaatcagaaaa caaatcgaat cggtagtata 43740 cgtagaacct gaattgctat atcatatgttcgtagaaatg ctgggatacg atgtggtagt 43800 ttatacgcta tatgccgccc aatgtaccaaatatcccgat aataaaacgg gggtggttaa 43860 gatgtttagt aaaaagaagg 
tgttttatgtgctgaaggtg ataaaagtga gcaggaaacc 43920 ttctttctgg aaacgtcttt tagaatgggtaaaagctatt atcagggggt gatatgtatt 43980 acctcaagtt gccggtagca aagcactcaccctttgattg tatctgggtg ttgtttatga 44040 tacattactt tcctgtaagt gtttctttaaacaccccgaa cgctgtatat tttaacatca 44100 aaaattttaa acttattaag agaatttatcaaaggttata atgtggaaca atcaactttg 44160 gggtgatcac aatgattgta cttaaaacaccgatactcag agttacttcg tggttagata 44220 ttagaaccgt tttgtacgtt gaggggattggatttgttac cagaatcccc tggatgtggg 44280 atattatctt tgaaattgtt tacgtttataataaaattga gcgtaatgct tgttattata 44340 ccaattacat caatttcact ttgaatcttgattcagtagg cggtaaagcg tttgctgtgt 44400 tgaaaggggt cgcaccagaa caggttttttccattattat ggtggttaga agatagaaag 44460 gtgtcatgtt cgtattgaaa atgcgtgttgtcgaaaagat tagagatcat tatgtacctt 44520 ccgactatag atcttttata cgtcttggtaactatacttg gttctatctt ttttatcatg 44580 acacccatga cataccgttg acaccggcgcataatacctt cccacaaacg tttgccgcca 44640 tgcagacgct cacggtcaaa tgcaagctggtcctctctaa ggagcagcga gaagcacttg 44700 acaccaccat gcgagcgttt gccgccgcgtgcaacgatgc aatcgccgtc ggtcgaagac 44760 tgaataccgc gtcgaacatt cgcatccaccgcgtctgcta cagcgacctc agagcaaggc 44820 atggtcttac agccaacctt gccgtccgtgccattgcccg agcagcaggc attctcaaag 44880 tcaagaagcg ccagtgcagt acagtacgcccgacaagcat cgactacgac gcccgcatct 44940 tctccttccg agaagccaac aagcgccgtggtctggaaga cgcggcaagg agactactac 45000 atcggtatcc acattaacgt agagacgcccccacctgaag atgagcacgg gtggattggc 45060 gtcgaccttg gaatcgcgag cattgccacgctgagcgacg gcacggtgtt cagcggcgac 45120 cagatagagc gggtccgtgc tcggtatgaaagaacccgcc gctccctcca gcgaaaaggc 45180 acgaggggcg caaagcgcgt cctgaaacggctctcgggaa gggagcggcg cttccagcag 45240 gcgatcaacc acaccatcag tcgccgtatcgtagaccggg ctatcgccga gggtaagggt 45300 gtccggctcg aagacctcag cggcattcgcaaaagtgtgc gcgttcgaaa atcgcagcgc 45360 agaagaatcc accgctgggc gttctatgatttgcgcatta aaatcgcgta caagtgcgcc 45420 cttgccgggg tgcccttcga gctgattgatccccgatata cgtctcagcg ctgtccggtc 45480 tgcgggcata ccgagagggc aaaccgcaagagccagagca agtttgtctg ccgctcgtgc 45540 ggattggaag cgaacgccga 
tgtggttggcgcaattaaca ttgcactcgg gggcgttgtc 45600 aaccgtcccg aagtagcgcc cgatgatgtcgaagcggtgt tgcatggtca gcgccgaact 45660 gagacggagg gcagctacaa gcccacgactgaagtcgtgg gtagttgatg aatatccata 45720 gccatttatt taatcaaaaa tgcttctcgaaagccgaaaa ggagaattcc tacaacagga 45780 aattcttcgg ttgtataaaa cctatggggatcgtcttctg gtaagatttt ccagcgccga 45840 acgcgaaacc ttcaatcccg acgccgactatttcacaacg cctatcggta cttacgccta 45900 tcctgtcggt gctatcttcc acatttcggaagacgatgtg gtgatcgatc ccgacatgta 45960 cggggtttcc gaaagaaaat atatttatttttttgtggca agtaaagatg cttcttggct 46020 taacatatcc tctcaacatc cggcgtttgaaattcccctt gttttgtaca accagttcag 46080 aaattatgcc gatctctatg acgtttcactggatgatgtt ttccggaatc gaaacagtat 46140 ggaaagctat cttacctact ggtgctttgccattgcatcc cgtgttttct ccgatcttac 46200 agagacactc aagcagaact tgatggaattgcttcgaaaa gatcttcccc gtatgcgggg 46260 atattatcag gagctttcaa atatttgcagggaatttgac gtcgatgttt caagattcta 46320 tcatgcacgt aacaatcccg aagaatggctcaatttgctg attgcagaac ttcttgaccg 46380 gctcaacagc ggcttcagac acatgaaatcagccggggat gtaaagcata agtatttcat 46440 gtatcctctg atcgttttta taacattgcttcacaacagg tatgcacctt atccgaattc 46500 cattgaagcg gcttataata taggagccaaaaaagaccct gttgttctga cgggtttcct 46560 tcgaaaggtg ggatatgatg gaatctgggatcatggcacc ggagccattc actccaatga 46620 acccgctcag gtggtctggt ggaaacccactgctgcaagg ctggtgaaca aaatggataa 46680 ccctctttat gtttcgcctt cctccataggattcggttat cttgcgtttg ccgatgaagg 46740 ggttgcaccc tccaatgaaa aacagaaaaaatatttatgg aatctgattt taagtggtaa 46800 aatggatgag tttattgaaa tcatggatatgatcatgtac cgtaagtatc ttgcagcgct 46860 tttcaacgcg tttttgaatg aaagacgggtggctctcaaa cacgctatcg gattcaaggc 46920 attcaaggaa tatctcaagc aaaatgcagaggaaatcaga aactttttca gagtgagcag 46980 caatgcgccg gtgcagcttg tatgggaccggttcagaaaa gcattcagaa tttctgaatt 47040 acttcgaaac tacgaagaat tgattgaccggcacccttat gaggtggatg attttgccca 47100 caagcttctt ggtaatttta actttttgaaagaactgatt aagcccacca gactataaaa 47160 cgcaaaataa ttaaaaaaat gaaagttaattaaaataaaa ggaggtcaaa atgaagaggt 47220 tgacaaaaga acagtttatt 
aacaattttcacgagcccaa ctcgctgcat ttgttcccat 47280 ctatagagga tttcattaac cctcgacaaggagatattac tcaatcctac tgttatgtat 47340 tacctgttca ggatttaaaa atcgacaacaaaatgggcat accggtaaat tttgatttat 47400 cacaggcttt aaataagatg ataggttctaaaggtgagtt gaacaaaaat ttgatcaaac 47460 aaaaaaactc ggcacttaag gaattaaaaaatatattaca gaagtttcac aaaattttac 47520 aatcattaaa atctaatttt aatgaagggatagcactggt ttttcattcc ttttttttta 47580 tgaaaaagtg cactcctttg atcatgctcgcgcatcgtat gattatgtaa aaagcaaccc 47640 caaaagtgtt ttagagccac tcaatgaagcattaaaatac gatgaagaaa tcgtcgagga 47700 agctattaga gaaacagtat cagattatctggaaagtgga gactggtatg atatgattga 47760 aaatgcagtc gaaaagtatt tgaggggttaattaaaagaa aaccatgctt gatcagcttc 47820 tttctctttc cgggctttac tttgatcaacagctttttgc gggttcaccg ggagagttgt 47880 ttttgcggtt ggtggcggaa gcactcgatgaagcggagtt caatgtaagg agtctgcaga 47940 accgaagcta tccgctgact gtagagaatactgatgatct gctgagactg gcacacctga 48000 acggtgtaag tattaccccc tacgttcagggaattgtcaa agcagaactt cttgttactt 48060 tccccatttc ggttaccaca tctgttcctgacttgacaac acatgcaccg gaaattctct 48120 acatggatat tcttgccgat acggattatttctatctgga ttataccgat ttccgccaga 48180 ccgatacccg tatcattacc accagcaccaatcttatcta ctcaagagac gtagtctttc 48240 gtcatggtag ggttgagcgg agaagctatccggtaagtca gacgatcccc ttcatgatgt 48300 tagaacttga ggaagatgtg gtggatgttaagaacgtttt cgtggaatac cctgatggaa 48360 ggctggtcaa gttttaccgc tcacgtaatcttcatgaaaa tctggtggtt gaaaatgcgg 48420 taatttacaa cacccgccac atttacgacgtggtgttttc ctcggggaga gtgcatttgc 48480 ttttcggtag aaagatttct ctggaagacccgatttcaca taccggctat acttttcccg 48540 ccggaagcac catatatgtc gacacggttgcaatcgatcc gactaccctg aacagcttca 48600 ttccggaact tgaagcagat atcaaaaccgttaagatcaa caaccgtatt ggggcaaccc 48660 ctcagattca ggtgctcacc gaaggtggatacacttcccg tcttaaagac atcgaatatc 48720 tcaaacggga actgcttgtc gctcttcagaaagacgaact ggaaagagaa atcgcaaaat 48780 atttcgataa atacagattc gttcgagaagatgatattgt ctatgtggaa ggagccatat 48840 accgcaacgg tagattcacc ttccacgaagccgatcgatt ctatatgcag aaagtggttt 48900 ccacctacaa ccgcaacatg 
atcgtcaggaaaattccgat cacccccctc aagatcatca 48960 ttcgcgcttc caacattctg aaccccggagaactgatcac tttcgtaaaa gattatatca 49020 gaaaacttcc gatcggtgga acgtggatcacaaatgaact tgtggggtta ataaaagaaa 49080 aattcaatgt cgtatgtgtg ctggaaatttattttggaga aacttatgcc cgaaaggttt 49140 cggaagatat tatcatctac gacggcgtactcgacgttga aagtgtagaa gtcaaacccg 49200 tactggtttg atggggcgct atgcaaaaaccaagggaagg aaatttcaga actttgtaaa 49260 atcgctgctt gaatccacct tcaaaaattggagcttcaag acagcaatca tgggcgaatc 49320 aggttcagat gtcaagatat ttccggagcagattttttcg gttgaagtaa aacaccacaa 49380 aaacggattg atcagaaagg atgatatgccttctgaaacc gtactcaagc aagcacgcga 49440 gcttatccgt aaggaaaaca gtcatttctgtttgatcgtt ttgaaggaga attacaaaac 49500 cccacaatat tttgtgcttt atcgaaacggaaagctgaga aagctggaag atatatcgga 49560 gcttaaggaa attgtaaaaa gatataaatgatagttactt tgagagaaag accgtattgg 49620 agatatattt acctgttgaa aattccatagcggcgcttaa gcaaaaactg gcaaggttag 49680 ccgctgcaaa cgaaaccgca ggtggaacgcctggaccccc cattttgctg aactcctgag 49740 caaacttgcg catcatcttc gtcatgaaagaattgaaaga ttttccaaga agcgcatttt 49800 cggtagcgct actggtattg cctatataaaccctgtctcc atgtagatac atccgttcgg 49860 tggaaatcct cacctttctc tccgcacccattagcaattc ttcctgcatg gcaaggtgca 49920 atttttttgc agccacaagg attcgctcacgttgcagcag gtgcagtgaa tccgatctta 49980 ttttaacaac gtaaccgcca ctttcttcatccaccgactc attcttgaaa gatattttat 50040 atgatttcag atttacctga tcgccgtctattttaccata gattccatac ggggatttat 50100 cttcgtcgac agaagtggtt aaatcgtcagagacttcatc actgtacttt ccaagaaaca 50160 gatttccctc ttcgtcaaac catagcgcctgcattccctt tccgttgaga aaatattcac 50220 ccggaaacga tcttatccgt gcccgtttcaattgattaat cgtaacacca ctttcattgt 50280 cgttggtggt ttgaaggtag tttgacatgttgacggggaa ggggaaatac cagagccttc 50340 cattgatttc cacataagca aggagatcacccacctcagg ataaaatcct atgtgcacga 50400 aaaatggata ggctacacca atcacttcttcagaaatgtc tttgattctg accgccatgt 50460 atctttcggg agtatcgaca tcatctacttccagtaccaa tccgaatttt accggagatg 50520 aaaagaaaga tattttgctg cttttgttagaaaagaattc ctgagtgttg aaccccattt 50580 tttattttat ttttgttagt 
taaacatataaactatatag ttttctttta aataaaacac 50640 caaatgattt ttaacacttc atactattgaagatttttca gaatacgatc cacgacctgt 50700 ttccattttc ggttatcctt atattcaccaacaatctctc ctgctatctc gaataccgct 50760 tctccaaaat accacaaatg tttttccagttctacgtgca catgatattc gttcctttca 50820 atatagtctt caattaatcg tttgataaaatgatgtaaaa cgtattcacc ttcgaatctt 50880 tctttaacat aaaacaccac aatttctgctcccagagata ccgcatctgt acctctttga 50940 ctgataggtt catcataacg cattccaaccacaatactca taagaacttc tgagggtaat 51000 ggatgacctt catgagagtg gaaaagatagtggttagtta ttttagtcag cagctttttt 51060 accgtctctc tatccagaaa gtgtgtgtagttattgattc tatcttcaat atagtcaacg 51120 aattccaatt gcagtttggt gaatataaattcttcttggg cgtatagttt atcatataca 51180 aaatctacaa gatcatcatt ttgttcattgtactgatcaa taatggaagt aacaattttt 51240 tctatcttat tctctataaa tttatctacgttcaatgcgt gttttaaacc ttcctttata 51300 atctctaaaa tttcggggtc tgataaataatcttcccaca aatcatcttt aagcatatct 51360 ttataaacaa tgaaggcaat cgcatcttttaccggttcgg gaatgtgtgg aataaggtct 51420 tcaggttcat ttatgcgtat tgctctttcgatcaacaaac gaatataaaa atctatagat 51480 tcgggggttt gaatattaaa gtcaagtatcctatcaaggt ctgttctgtc atcaatgcca 51540 tattttagca acagtctctg cttataagatggaatagacg gcacctgttc cacaataaat 51600 tcataaagag gagaaccctc tttaagattgattacaaaaa ggtctctcaa ataattagca 51660 gccgtcatct ttactcgagc attgacaaacgccattttta taacgtgggg ttcattatag 51720 ttttgaacaa cattaaaaag tgaatggaaatgttcagttg attcggctac tattagtttc 51780 ccatttctgg ttcccactat cataaaagaataaaacaggg atgtgaaata ttccttgccg 51840 gttatgctgt atggaaaatt agactttcgtagtccacaaa aaacaagggc tacgcccctg 51900 tcaacaattg atttcaactt gtttaatatttgcgaacctt tggtttttat ctcattttgt 51960 ttatgaataa tgaagtttct aagttgcaaaaattgatctt gggtgaccct tgatgaatgc 52020 ttatagatat aattgaatgt ttctaataagtggtgattca agggaatgcc ttcttgggga 52080 tgaaaactaa gtttttctac gggtaaaacataacacctta aaaacagcga atccccacta 52140 tcgtttggta aaacatacat gaaccagttaattgaagaaa aaaagcttaa cgcctctttg 52200 ctgtggaagt tatccgcaaa ttgttctcttgtcagtatca gcatataacc taagcaaagt 52260 ttattataca aggtaaattt 
ggattaaataataaccagac ttctcagatg taattattcg 52320 tcttttccaa ccacatcaga ataaaacgtttctatttcgt taagatcttt gttgatcaga 52380 gtttcaattt gtttttttca gccccatgacacaccctcca ttttttggca tactaaacaa 52440 ggcaaaaccc agacagttcc atagccgccgacttatttaa agaaaaataa atgaaaaatg 52500 cttcaagtac tgaaagacac ctatttaaacagcgcttccc cgcataacaa ctatggagcc 52560 gacgaaattc tccggctcaa tgccacttccagcattgcat tgcagtttga aaacccgatt 52620 ggaacgggtt atgagattcg cctgtttgttgccgacgcgt ggattcccca tgtagaatat 52680 ctgggtgggg gaagctatca ccggctgctcctcaccgttt cgctctacag cttttctatg 52740 gatgaaggat atggaaccga agtagaaccgcttataagcc agagtttcaa ctatgcgtcg 52800 ctgtcaacgc ttcctttacc actggaagttcgcacggtaa gcgcatttat tcatctggca 52860 ccgctcaagc ggcgtatggt aagcattccacttacaaact ttttcaacgc cggaaacttt 52920 gttcttatcg aatcggctga ggaaatggcggtcaactttt tcagcagaca gacgcgcacg 52980 gctttcattc cctatactat tccgacagtatccttgcagc ccccggcgct ttcagacttc 53040 gtatacgata cccgcataga cgactacggagtatatctgc aggcttggga gcggaagatt 53100 cccattgcgg taaggggtta tctcatgcagacgctgtcat acatagacct ctcaaccgta 53160 tggtttgaag tatacgtgtt cgacatgatcaccggtgagg aacactatta tacatcgctg 53220 cttcccactc ccgttgggaa taactggtactatattgaca tgagccgtgt caatatgaaa 53280 agaacccagt atgtgagact caaaccggttggaagcacca acgacatttt cctttccttc 53340 cacaaccgct atctgagact atgaacacccaacagattat aaaacaggag cttgaaaaat 53400 gtaaaaacga tccgatttat ttcattcgtaaatatgtgaa aatccagcac ccgatcaagc 53460 gcgtcatacc gttcgatcta tacccgattcaggagaaact cattaacttt tatcatacac 53520 accgatatgt aatcacggaa aaaccccgccagatgggtgt aacgtggtgt gcagtggcgt 53580 atgcacttca tcagatgatc ttcaactccaactacaaggt actgattgca gccaacaagg 53640 aagccacggc aaaaaacgtg ctggaacgtatcaagtttgc ttatgagcag cttcccagat 53700 ttcttcagat taaaaaacgt acatggaataaaacctatat cgaattttcc aactattctt 53760 ccgcaagagc cgtctcttcc aaaagtgattctggacgttc ggaaagtatt acgcttctga 53820 ttgtggaaga agccgcgttc atttccaacatggaggaact ctgggcttcg gtgcagcaga 53880 cgcttgccac cggtggtaaa tgtatcgtcaactccaccta caacggggtt ggaaactggt 53940 acgaacgcac aatccgagcc 
gccaaggaaggaaaaagcga attcaagtat tttggtatca 54000 aatggagtga tcatcctgag cgagatgaaaaatggtttga ggagcaaaaa agattgcttc 54060 ccccacgtgt gtttgctcag gagattctctgcattcctca gggttcggga gaaaacgtca 54120 ttccgttcca tttgatcaga gaagaagaatttatcgatcc gtttgtggta aaatacggtg 54180 gagattactg ggagtggtac cgcaaacccggttattactt tatcagcgta gaccctgctt 54240 cgggtagagg ggaagatcga tccgccgtaggtgtgcaggt gctgtgggta gaccctcaga 54300 cgctcaccat tgaacaggtg gcggaattcgcctccgataa aacctcgctt cccgtcatgc 54360 gtcaggtgat caagcagatt tatgacgaattcaaaccaca actcattttc atcgagacaa 54420 acggtatcgg catggggctc tatcagttcatggaagctta cacgcccagt attgtaggat 54480 actataccac acagcggaaa aaggtgcacggatcggacct tctggcaaaa ctctacgaag 54540 acggtagatt gattctgaga tcgaaaagactcttggagca gcttcagcgc acaacatggg 54600 ttaaaaacaa agtggaaaca gcaggaagaaatgaccttta catggcgctt atcaacggtc 54660 tcatggctat cgctactcac gaaatcatggaagccaaccc tgaatgggaa aagattaacg 54720 taaccttcaa cagttatctt gggaataaggtaacccccag cacgctcgac atcaaccaag 54780 agtttggagg agaatttacc tatatcgccacaccgaaggt aaatcctgat ctgaacaaaa 54840 atctattaat tcaaaaaaaa tccgaagatttcatctggta tatctgaaaa cggctttcca 54900 cacaatccca attaccagta tttaatatccctctctgata tactcccccg ttatttaaaa 54960 gaaaatgcca ctgagtagag acatcataaatcgaatcaaa gagaaacagg atactctcag 55020 agagaatatt acctacagcg caaagcttctcaagaagatt acagaaacca accttcagaa 55080 attcttttca gagacgctta catgggggataagggaagcc aaaaaccttg tactggcaca 55140 acttcctcct gaatacagaa ctcaaaatctaaacaacccc acacttactc ttcactggtt 55200 taccttcaat ttcaatccct ttgtttacaaacgcgaagtt aaaagcaaac tttatgattc 55260 tccgactccc aaggtttatc ctcttaaaagccatgattat gggtatagaa cggagctttt 55320 gagtgggtct ccggttcctg ctcccaaccttcgctatatt gtcagataca atcctgaaac 55380 cgatcgtctt gaagctcgca cggtggatattaccaccgaa gaaggaatca gatatgtgtg 55440 gggtgcgtcg ggtaatattc ctcaggatacgctcgagttt acatcgctac gtggtcttgg 55500 taaagacgat atgatcgatc tggctcagagcggcgttccc tatgagaact cgctggtgca 55560 gcttttccga aacagagctt ccattgggtttcagtatgat gaagaccttc gcaaacccat 55620 tcaggtggat cgtatcaata 
tggaaggatttactcagaac gaatcggaga ttatcaatga 55680 ttatgttacg ttctatttca agagcgtagtgagcggctgg atatgtcagt tcagagcttt 55740 tatcaacagt tttggtgaat ccaccaacgcttcatacaac actcaggatt atatcttcaa 55800 catcatcaaa atgtattcgt atatcaatgtagagaccacc tataacattt cgttcaccct 55860 gtttcctatg agtaagcagg agctttcaaaaatatggggt aagctctcat ttctcaaagc 55920 acacctgttt ccggcaaagc gggtaacacccggcggcaac tttgtacctc cggtacttga 55980 agtaacgctt ggcaacgtct ggagaaaaaggaaggtgctt cttacttctc tcaatatctc 56040 attcggggaa gataccgtat gggaactggatccaggtatg caacttcccc agtggatcaa 56100 agtggatctg aatttgattt tgctgtacgaacagaatatt accacggaag actggcttca 56160 aaaccgcgtt aaaatgttcg attatacgacaaacaagccg ccttctacgc ttgccgcctc 56220 cgactccatg atcgatcccg caacaggcgtggcacttgac atttcgacgt tcaaataccc 56280 ggaacccgaa agttttaacc tgaaacttgcaaaactcgat atacttaaaa accttggata 56340 aattatgaaa gtatattctt tttcgggaacgcgacgcgct cagaacatag ccgtacagga 56400 atatggagat tactcctact ggcaagatatgctgcttgca aacggtattt actccggatc 56460 gatcattccc ccgtatgttc cgtcgctttccatttacacc ccggaggaac tcgagaaccg 56520 tctggtagat aaataccata ttcccgatctgaaatatttt taacctatgc tgataagaag 56580 cctgcaccct tccgttgtaa agtatatcagacaatttgct tcgacatcga cggttcagaa 56640 gatttccgca aggcttgtgt tcatggtgcgcgtgagagac gccgcacctt tcagagcgta 56700 caacattgtc ttaaacaaca taaatttctataccattgaa aacgaaatca ctcctgatct 56760 ccagtcgtac tacgattatc ttccggctccagctattctt tcggtggacg tcgatccggc 56820 tcctgacggg atatacggta tgatggcgcgtgccaccgtc aatgtgcgtt gcttttctct 56880 caaacaactt cgggaactgg agtggagcctgtttccggga attacggcgc tcattgaagt 56940 agtgcgcaca aacaatgaaa ttcccgtggattttatttct gatcgctatg tgcgaaatcc 57000 ttcgcttctg aaagacattc tttttagcccgcaatcggta atcaaactcc atgagagaga 57060 tgaaggcaat aggatatttt tccccggaatacttaaaaga acaaatgttt cgtataacaa 57120 caataccttt gacattacct ttgagtttagtaattttagt atagcttccg tatttttttc 57180 tcgaaactac gatattaagg atgtagagacggctcgaaaa acgctggctg gtttctacaa 57240 tgagcgctgg agtacgcttt ccagccagaagaaagtcaga tcgggtcagg atctgaacct 57300 tgacagaacc tatcagatgt 
tcggtggggggaataaagca tttcccgccg aaaagggtat 57360 tgaagtgggc gtgggtactc atttcgatacaggcgacaaa actttcgccc cttcgcttcc 57420 ttccaacacc ttcgagtcgc tggaatatattcgttttgaa gatttcctga aggaaattct 57480 gattccctat attcgggaca cctacccggaagatgttcct ccggaaatgg caattctacc 57540 gatcgacata gacaactcct atatgttcattcataaacac ttgagaacca acaacgtaga 57600 tatcattttc ccaaccgaat acatggtgttcgattctacg aatatgacgc cggattacat 57660 tatgggattt tcagattatg aggatcatgcagagtggttc gagaagaatt tcgggaaacc 57720 ttacacccgt cacccgattg gatcagttggtaaagtgggg aaagtgatgt tggctcgaaa 57780 gtatctttcc gaactgatcg gagaattcgaacgcggcgac gacaagccgt tcagtttcat 57840 tattgataga atcattcagg atataataaaatccacctat ggcttttctc agcttttcct 57900 gatgaaggtg ggagagcaat acgtcatttatgataataga cttctggatg tagagacgcc 57960 tgttcagcag gtggaaaaca aatcccgtcttgaaccggaa gaaatcaaga tatgggaact 58020 tcacgacatc agctatacgc tggatattcctgaatatctt gcgatggcgg taatgatgaa 58080 gcgtctttca gactcgctga atacctacgtcaacgatcca gtggatttcc ttattcccgg 58140 ttccgttgag gatgtggtgc tgaagacgcttaccggagag cgtgtgaaag gaaccgcgct 58200 ggaagatacc acggaaagtt cggatgtggttgttaccaag gtgaacctga gcgctgaagt 58260 aatccgtgca ctcatgaaca atcccaatttcagagcgctc atgaatgtaa tcaaagaaaa 58320 tgaatcgggg ggcaactacg aagccattgaaatagaacat attatagcaa aacacggaag 58380 ttatgataac gcttttgcgc tggcgcggctggcgaacacc cgctttgcgc ggggtaaagt 58440 gtggtatcgg gtaagaggcg atcagaaagaggaaattacc ggagagcttg taagaaaggt 58500 cgaacaggct tccagcttca gcgatctggttacgcacccg ttcgtcgatg tgccgaaatc 58560 tcaggtgtcg cttccggttt ctcccggaagatataccacc gcctgtggcg cttaccagtt 58620 tacggaaaca acatggcggt ggatcgagagagagtacgcc gatctgtggc gggagcttag 58680 taagaaagcg gatgtggcgg tggattccgccggaaatgaa atggtggtta ccggtcttcc 58740 acccgctacg gtatatgaat atcaggcggttgtcgacact accgttcagt ctcgaattgt 58800 ggttcctccc acccccgtca atcaggattacatggtggca atttatctca cgatcattct 58860 caacaacgca aaccttaccg aagaagagtggaatctgttt ttgaacgaag gattcgggtt 58920 taagcgtgag gaaatagtta aagaaaaacttaccacccat tttgcttccc tcagaaaagt 58980 caatctcaat gcttcaatca 
gaagagacgcgtttgagcgc aaaggaaatg tcagtacatt 59040 tttgagtata aaacataagg atctgagcgaaacaaaaagt gttaaatcta ttacatttga 59100 tgtaacgaag gttgacgata gatatgtagcctacattccc atgcacctgt caacctatta 59160 caaagtgctt ctttatatgg gcacgctcccggaaagacag cgggggaagg gtgctcagta 59220 tctgaccggt attacactca acataacggttccgggtaat tcgctctgga ggatttttga 59280 cacgttcaaa atagaaggta ttcccgaaatctattatgaa aacggctatt tcattgtaac 59340 gaaaatctcc cacaacatat caggcggaacatggaccacc ggggttacgg caaaatactt 59400 ttacacgggc aaaacgtaaa aaaaaactatgagcaagtac tttctaaaac caacttctta 59460 cgcttccgac gtttatcttg caccacacgttcccgaactg gaatacgttc caaaggaact 59520 gataaaaggg tttgacatgc tcctcaactggatcagtgca ctggaaacaa atcatctgtt 59580 ttacagcgca atcaactatc tggctaaagattaccatgta aagaaacacc gcgaatatgt 59640 gatccatttc atttatccta aattcaatctttcggaaaag gattatccag aaaaagatga 59700 agattccctt attatgcttc ccgatcacccttttgctcgg caccgcaaag aggaaatctt 59760 aaaaccattt aagggtagat atcttgcgtttaccgcttcc ggaagatatc agtttattcg 59820 atccacatgg aaacatcttg taatgaattatcacactcag aaaattaccg ccttttcttc 59880 gctaaatcag gattatcttg cgctgtgtcttgtaagggaa gccttaatgc gcgttaaggc 59940 aacggggaat aaacggtata tgaacctctgggagtatttt atagactacg gatatattca 60000 tttcgatgaa ttcatgcacc ataaacaggtagtatatgcc ctttcaatgg tatgggaagc 60060 tttccagaaa tttcctgagg ggcttcagagtgatgaattt attaaagaat atgaaaagct 60120 ctatcgctga cgagtttctg ttacataccccgtcgatttg atctgcataa tcgcttctct 60180 tctggtagag tcgtacagga tagaagtctcatgatccatg taacccaact ctccggcttt 60240 ttcatgaagg ataccgctca gcgtctgcgccaggttcgag ttgggaagaa tacccggcgt 60300 gatttcgata agatattttc ctccctgtttcgatccgttg agtttgccgc ccccaagacc 60360 aaattcccgc aaaacagatt tcaacctttcgacgttaatt tctttgcttt gagccgtaac 60420 gtatatcgag atcgtattcc gctctttttcctgctggtat ttgtggttga acaccacttc 60480 gatcttttca ggggtttcaa aagcagtttgcagtctggag acaaaacgct caagtaccgg 60540 aatgcggagc gcggctttta caagttgacgcgttcccttc ttctggataa ttttgtgaga 60600 gcgctgaatg atgaatgaaa gatgctgaagcacttccaca gtgaattcgg caatgctttc 60660 gtctacttcg gcaagctgat 
cgtcgggtaaattgtcggct tcttccagcg attgataaag 60720 tgcgttaatg tattgcctta cttcatcagggatatagaaa tgcatgaccg gctcaactct 60780 gaaaggtagc gtgccgtaaa aaagtacggcatgaccgttt tctctgttaa tttcgatcat 60840 ggtggtaccg atattctgta aataattaccttcaatcggc acccttcttt tggacgcccg 60900 ctccagagga atctccatct cctcctgttcgatttcttcc tcttcttcct cctcttcttc 60960 ctgagctaaa acaagcatgt catcttcttcttcctcgaca taagcaggct caccggtggt 61020 cttgaagaat tggagcgtaa tgattgccacctcttgcggc tgaggaagtc tgatcacacc 61080 accttctccc caccccggcg gcggctcaaaacccatgcga atcagcgctt gcatttcctc 61140 cggggttgga gccggaagtt ccatacccccttcggtaatc tgacagataa ttgaaagctc 61200 cgggggattg atgcgcctta cgtaaacgatgcgatcgtaa gggctgtacg caccgacgta 61260 ttcgtgctcc tgcaaagtgg aaaaataggaaagaaatgtc tgtattccct ttagcggatt 61320 cattttcttt taaatatatg cttcttcaggaacaagcgaa atggtgggag gagacgtgat 61380 tcgggaaaaa tctttaaatt cataatcattcccacgatac agtagatatg gagagattcc 61440 ccctacaggc aaaggtggtg tgggttttcttgggggagta ccaccgccgc cgggttcatc 61500 ttcatcgtca tccatatctt tgtttctttttccaccaaac agattttggc ggtattttgc 61560 ctcgaaaata atttcaataa tgtagtaaagaacaacgcac atgacgatac aggcgagcag 61620 aaaaacgctg acgacaaata tgtactggatcaggtactcc atgttacaaa ccgttcaaaa 61680 caacttttac atacgggttt ataatggcttacctccggca caattaaata agggtcttcc 61740 agagggttca ggtaacccat tcccctcaaatccacacttc cattcacaaa tcgtatgttg 61800 tattgcgttg gattaccgca ctcacaacgactggaaatgt ttattttaat gggattgtat 61860 ttttcttcga tctgtttcca taccggaaattctcttccaa gatagtcggt tctgagaccg 61920 ctgagcacca cttctatttt ccacgaaaaaataaattcga tctcttcagg tgttgcaaac 61980 tggaattcat caacggctat gagcgaacacctaccggttt taagctggag atattcggct 62040 tcataaaatg tggggttttg aatgaagtcggtaagattgt aaacgcaaga atgggtaaat 62100 ccgcttcgag atttcaaggt aggggaatagccgtaaatgc ttccgggttt aaagacaaga 62160 taatcgtcaa agttttctaa aagttttataagaaaatgag ttttacccga tgccatcgcc 62220 ccgttgatga cggtaacgga gcgggttgagcgttccttca gaaattttat aacagcctta 62280 tccagttcga tattgtgcag ggtggtatctccggaaagcg tctcagggaa atcgtacttc 62340 atagttgatt tatttttaag 
ccgaagtctctgaccacaag gggtttgtca tcggttctac 62400 cccagttgtc gatcagggta aaatcctcacccagcagatt gaattgccga atcagcctta 62460 cggtttcccg gattacagga tttttaagaacggtgaagta aaaagtgtgc ctatcttcta 62520 cgcttacgtt gtcaagcgtg agtctattaatggcatttcc aaatatatcg gcaaaacgcc 62580 tatcataagc agaaaccacc tcgaaaccctcgaacgttgc gtcttcccgt aagtagcgtt 62640 tccggataaa cccttcaatc agaacactataaacatcaaa gtcaatatca gcgacacttt 62700 caaaataggc ttcattgaca ggtgcgacaaattcggtgat tagaacccca ccttctttga 62760 aaacctgagc gtagtcgacg gcgatttcacttcctgacct acgcaccacc tcatattcgg 62820 taatgttctg tttgattcca ttgtcattatgagcaatttt caaaaccagt tcggtgtcgg 62880 gtattctgaa tacttctctc cctcttccccttttgacggg ttcaaggtat ttcttttgag 62940 ccagaaggta agccgcgcga agcgggttttcacttcgctg aaagtaagtc agaatatccc 63000 ggagcgtgtc tgtttctttt aaggttatcataggctaatc cagtgttata tcatacatga 63060 tttgtgcggc aaccacttct ttaaaatattcaacgaaatc tctgtcgtct ctaacgtcct 63120 ttaaatactt ttctatttcc ggttttacatattcaaaaaa ctcatccata ataacttgtt 63180 tgataaataa tgtaaaatga tccccctcttcatccatatt gagtttgtta atgataacac 63240 cataagcaat tcgtataaag gtttgtacatctattacgtg gcgaaggtct ttctgaataa 63300 ggatattttc tatcttatgc tttattggaaaatatccgga aagaaaccga atcgcttcct 63360 caagattttt gctgacaaaa gctataatatcgctctgcag ttttcgcaaa atataatgat 63420 aatgtgcatc ttcgatttta taataagccgcatctctgag ataatcgtat atctgatccg 63480 cgatattaag atgatctatt gcttcctttatggcttttgc tttgaaattt ctgtcaaaga 63540 aaattgggtg ttgccttctt tcctcttcttcaaatttacg tacaacatat tgtgcgatcg 63600 gatttaaatg atcttgcccg taatgataagccgcgaatct tatgacattt gaccagaaat 63660 attcgatttg gggaggaaaa agcacatttattagactggt tccaccgtat tcttcataaa 63720 gaaaagagcg aacacaccag cggataatgtcttcgttatt tttataatcc gacagaatca 63780 aataaacctg attggtggga tcgaagccgccaagtacatc tcgagtcaat tttgagattg 63840 tctctttcag ttcaaaagaa ggggtggtttcttttagcag gaatatcatt ccgtccccgc 63900 ggggtatttt accttcgtgg atgtaaaaatagtagtacat ttcacgggtt acaatgatat 63960 tccagagagg agtagccacc cattgataaacttcgtcatc ataccgcgcc cctataagta 64020 tagaacccat aaacagaaaa 
tctatatctcgatacttttc attgagcgtg cttacctcaa 64080 taatcacaaa aaaatggtct gtcatggacttcaattttct ttccaataac accaatgttt 64140 ttcctgtttt tatcaacgct ttccgtgtttcttcaagtag ttcacccgcc cattggtgct 64200 gtcttaattt ttcttcaatg aaatttttgtaaatctgttg gtacataaca tttgacaggc 64260 ttgttctggt aagatcctct gaagtcaatatgaaatagtt ggtgttgtaa taatatgggg 64320 tgtttggggg caaattggtt ttaataaactgttgacttaa cccgttgaac aacggtttat 64380 taagaacaaa ggtatcggtg tttatcgtttccattggttt ttatttaaat aaaaagaacg 64440 tatgagagaa ccttttctgt ttcgagatccgacaatcgaa agctttggaa gctttttatt 64500 ggaatacctt gacattcagg aagttcgtgttaaaaccgaa tttttcggcg gtaaactgca 64560 aaaactcaaa gatggttatc attttccggatgtaaaactt aaacccggta aagatgtcga 64620 aaagttccga actctgtgca acgcattcgggtttgatgtg gaaatatccg aaaacgggat 64680 aacgttcaca aaaagacagg aatattgttttatcgaggag gctctgaaaa aggcgacaga 64740 gaaatatcag attttcgttc ttgcaccaatagaagttgat cttgttttta catgttgcaa 64800 ccagatattt gtcgaatatg aaatatgagcactgttaaaa tacctttagc cgttaacata 64860 tacgacccca agggcgacga atgggaatttatctacagca actatgcggt agaagttgta 64920 ggaagtgaat atctggttcc ggttgtaacactgaaaaccg gatcggttaa ctatttcaga 64980 ttcaatgtgc ttctaaccta ctctcagaccgggtctttcc ccctttatct gaattttctg 65040 aacaaaaaca ccaatcagat caatgtagtttaccgaaata tcagttacag ttatatcagt 65100 tccagcaatg tgaactggta tcccacaagtatatccggtc ttcttggttg gtggcaagca 65160 tatcatccgt cacgtgttaa agattacatcatagaccgca ctgaaaacca gagccatctg 65220 gtaaaaattg aaaggtatac ctataatgatcagtggctta accctacaac aacattcgtt 65280 tctcatgaga gtaataggat aaaaatgatgcttccaatga atgatttgat tgataatcac 65340 gggaataact gggtgtcaga accccgaaattcttatgtag gatatgtttc acaatctcag 65400 aaattcctgt cgaaggaata cacttttttctatgtttttt cggtagttga aaaaaacccc 65460 tatgtaacag taagtgggga gccgctgataccgggtgctg catatcccgc cctttcaaca 65520 agctattact ctattattcc caagggtggcgaatatctgg ctggtttaca tatatttcgt 65580 tctaaaactt atagttctgt aaacgataaaatgaatacgg cttctcttat gattcttttt 65640 accacctatc ccgttataag tagttctacgtttgctccgg aatataaggg ggataatgaa 65700 aacgcttttt ccaatacaca 
atatcgcatacaccccgcta tagcggctat cggagagaaa 65760 gatttaaagt ctcattatgt tccgggaataagaatagtct atcatacaga atctacaatg 65820 aacccgggag ttcagcttta tgagctttatcttggttata agaataccac ttcactttat 65880 gaactggaag taacttcttc agatatagcacgttttgatg tacctaccat tgtagggtac 65940 cgcattaaac aaagtggtag cgttatttcttattctgtta ctttgaacaa tgaaccgccg 66000 gtatggtatg taattacggc aagcattccttccatcgatc tttctgatcc gatttttacc 66060 gatcatagaa acgaagccgg cattattatagggtcgctgt acgggtatct atatgattat 66120 cagcttggag atgtcggaaa tctttcggctatttatcggt ggggatccaa gggtatttac 66180 ttttatgaag cactgttata tacccgctcgcttgacgatg cagaatacca gcaagtgaac 66240 gaacaccttg ttaagaaata ccgattcgggctgtaatggg aagaataaat acgacatatt 66300 ttatttatct gtatttcccg cgtatagatataagcggtct tgataatata catattgaaa 66360 tagaaatatt gggtggcttt agttttacacccgtttctta tacctacaat acatctggct 66420 cttttattac aacagaaacc cccgttgtcagggtgatgga aaatcgcaca ccggatatat 66480 accttcatgt tgtgagttta agtgctttatatagtaattt cgacccctct cttcattctt 66540 ggcatatctg gcttgatttc acaaggcttacggcttctaa aaccgacggt caacctgttt 66600 atacatcgga tatacaatcc attcagagtgatatatctat ggaaaactcc ggaggctata 66660 cgtattatga aaatattatg aatgggcttcctatggtgcg aaccaacaat acaggattga 66720 caaaaaccgg tggcattctg acggatgatccgatcatggt agtcgcagcg gtttatatca 66780 gccaatccgc tacatattgt cgtcttataagctggggata tagtattaat gaagcatggg 66840 atgtatatgc tgagttttct ggcgcgttggtaagatttat atttgtcacc gatacggcga 66900 cggctgggag cggtcctact ataaccagtgactggttcag ttatcctcag gggtttgtac 66960 ttgccgcatg gcaagaggat gacgaaaccatgcatttccg gattatggat gaaagcggaa 67020 atgagtacga ttatcctgta attaccggacgcgggggcgg attttcaaac ttcagattgt 67080 tcgatattta ttatccaagt tacaactggggatttaataa ttatgtggga gaaatcattg 67140 ttcacaatga tatatatatg gttgaagacgtctttcatta tatggctttc aaatgggtgc 67200 cgggattaac cggaagggtg cggataaatcgcttgtggga aaatctttat aaacctgaat 67260 tatatacatc gctcaatagt gttgtacttattacaggctc aacatctttt accggttcta 67320 ttattaataa cgatccaatt attctaacttcaataaataa catagataca ctacaatgga 67380 acccgcaatt taccggatct 
attgtcaataacaacccaat catcctaacc ccggtaaaca 67440 acatagatac actacaatgg aacccgcaatttaccggatc tattgtcaat aacaaccctg 67500 ttttgttaac aacgataagt aacgtattacttttgatgtt taattaataa aaaaaccacg 67560 aaagctatgc cttattattt cgagtttaaagttagagaac tggatcttga accggtaagt 67620 gtaacgctct ctccggctcc aagttgggtttcggtttata aatacaacac ccagcctttt 67680 gaccaatttt acggaactta tgacattacagtgtttctgg tagcaaaccc acccccggga 67740 acaccggatg gtacctattc gatagggcttactttgagcg acgcgctggg cggaataacc 67800 acacattcag tcaatttcat aatcaacacttctggaacca ttacatttga tcctgtttcg 67860 gtgccggggc tctggggttg gtggcaacccggaaactggc ttactcagag cagtgatact 67920 ttcaatgatg tggctatatg gtatgacgcttctccggggg cacatcatct tacacttgat 67980 aggagaatta ctattttacc atggaatagtacagatgctg gaagtgctta tgtcggatct 68040 tacataaaaa cactttcgga taattcacttctgttttcat ggagccatgt caatcaccaa 68100 tttgccaata tgaattattc gtcgggggctgataactaca aacccgaaaa tgttttgatt 68160 acaaaagata cttcttttta ctccaatcagtactctattt tctttgttta tagaaatcat 68220 ctcgactggt tttctcatcg tataaccggaatgagattaa ctataaatca ctatgaatac 68280 tgggcaacca atatatggga ctttgatgttgaacggggta ataatcatct tgcaatgccg 68340 gtctattccc cggtggtgat taacagagcggcgccttata caaccgtctc ttatggatca 68400 tactggaatg acgattataa tcacgggtttgtcggcggct ggtttattgc gttctgtctt 68460 cctccctatg ccgctaatcc gtcagccagagacgcttatt actatgatga cgggggcgga 68520 cttaccacca tgagcgtatt caactatgcccccggctatt accagaataa tgttccgcat 68580 caaccttata ttaccatatt caaagttaataaatatgctt ctcaaacaga tgggtctctc 68640 ggtattcacc ctattaaatt gttttattacaccaatgaag aatatgcgtc tatgtcgcta 68700 attgaaagaa acaacaggtt cagcagatttgtctttacta aagatcagtg gaatgctgtt 68760 ggatatattg ttgaggaaaa tccccttatttccaacagcg ttgttatcgg ttattcctac 68820 acttacagca tttatttcaa cgaaacaacttccgttacaa aatctctgga agtaacattt 68880 tatgacataa atggcaattt cagacccccgacaacttatg cttatattga cggttcagac 68940 aaccagcagg catatataga cgtatatggtgggtttggca taggaacacg ttttgcgaca 69000 gctcagagtc aatattatgc caacaccggtactataggat ggagaactta taactttaca 69060 cccggggtgt tttctctctc 
tttcaaggaatgtctgtttt atacccgcgc attatggaac 69120 gaagcgcccc agatcatgga ttatcttatgaaaaaacacg gtatcccgtt tgtaagctga 69180 tatgctggaa tttacctaca gtggtacgttttcatacccg gatagtcaaa cactttccag 69240 tttttactgg attattaacg ccccgtctggaagtgttgtt acttattccg aaattttaaa 69300 ccccccgctt aaagaaatcc ctattgaagtaaccatttcc ctcgatacca caagtatacc 69360 gtcaggaaat gtaacatgga gtgttaacttttttgcatat acaaccacct ctattacagg 69420 agaagtttat ctttatattt ccaatatctcaggattggaa ccatatagca tatctatctt 69480 tctgacttca agttatgaga aagaagggctctggagaaat ctcgggttgg gtgaatcttt 69540 ttactgctat tcgctttcca ccactccgaatgtacgattt atcaaacaca ccatttctct 69600 tcagagtatc agtttgatac cagccggtggtagtatcaaa tgggaaaaac ccccggaaaa 69660 aacttattat tctttttcga ttttcgccaaagggtttttc cttagaacag ttgattttga 69720 ggggttgact acaagtcagc ttagctggtataatgatatt ccatttgctg tttcaggagc 69780 ctatctgtat accggatcag gatttccgctcattactttt atcaaccaga gtatgcttta 69840 tctggtaact tcatcggggg acttcagtaactttgttttt agagatctga caactaacac 69900 cgatgtgttt tctttcagtg tggaatatccaacgctttct cttgcaagaa tatatatcac 69960 ctacgatggg aatgattttg tcataacattcagcagtact gttagtgatt attactatac 70020 ctataatttg cccggactca gtttttctgatcatctactt attgggaatt atcaatcttt 70080 ttcgggtcat tccgcatgga actcttttattgtacttgac tataatgcga caggaagtgc 70140 gtaccagaca ataagcaacc tgatatgagccattttgatg aactacacga acattacagc 70200 accaccacgc tcagcgttaa cggggtagtggtaagtcata gttacagagc atttccttcg 70260 cttagctacg ttgaaattac gctgtacaacgtacctgcac ctactggatc aaattatttc 70320 tttgtttatg atcacgttta caatcaaaacatatttcttt atgcgctgaa acctcaggat 70380 atagggaaag aaattctgga aacggttagtttcaggatta ttgttgattg atcatcaata 70440 gataataaat tctggttttg taagcgtaatattgatctca aaccacccgt cttcgataaa 70500 cagtgcccct gctcctgaaa gtgtgtaatttcctgtaatt atattgattc ttctgttgaa 70560 atgggatgta gtcaattccc atatactaccacccgaaaca aacgtctcaa attcttcttc 70620 ctctataaca ggttgatgtt ctatctcaaccagactcata gaagcaatta tggtgcgcct 70680 gtagttgaat tcagatatat gattcatttctattgtgctg taggaaacga gtcggaattg 70740 ttgatgttcg ggaatggtga 
taaccgaaagagatatcccg tctactttgt gagaaaagaa 70800 aattctgttt tgagcgaagt ttgagtaaatagagtcgtgg gttttggtat atgtcccaag 70860 cccgatatat ctcaaataat atcttaccggatattccaga ctaccgctga agttgtagat 70920 tttatcaaga atgcgttcct gaagtgcagcatatcgtatt tcacttgctg taaatacata 70980 aggtatggtg gtgcggatat acgggaatgtaatataatcc agttcagagc cggtaagtgc 71040 tatgatacct ttgaaaatct catttggaaacgtgatataa cttatagata attgtgtata 71100 ggtataaaga taggtaagtc tcctgttttcaaacgtttct acaaaggaaa tggtatccat 71160 tgaatggctt gcgaagaaaa agagataattttcaatcaga tgctgtaaag tagcattaac 71220 aagagatttt gttaaacgat taaaatagagggttctgttg aaatgaaaag atacataatc 71280 taacccgtaa ttattggttc gaagtgcaataagatcagtg ttttgctcaa caggagcatt 71340 tacaaaccct gaaagcgttc taaaaagggtatatattttc cccgtctgaa aagcctctaa 71400 gttcaatccg atcggatcgg taagatatcccctgaaatat ttttcattgc gcaattgtgt 71460 attgaagctt acgtaatagc tgaaggagtaaagggtagtt ggatcaactg tatcgtgaac 71520 aggaggaaca atcaaatcat aagtcatggggagaaagtct attttttcaa tgctgattgg 71580 atcataaaac tcatttttcc acccaatgcggttgaacaca aagaacggag gaactccttc 71640 ggtggaataa gtgccagcgg gtatggaaccagacaaaacg aattctgaat aagtgggtct 71700 gtaggtgtaa agccgataga tgatagattgggaatactga gatgtagtag actgatgata 71760 taccgaaacg gtgaacgaat gggttccggatacgaaccca ctcattgtga tataaagaac 71820 caattctttg tattcaggat ttccctgatagaaattatca attatgctgt gcgaaaaaac 71880 aaaaggtgga agggatgata ccacttccacctgattgaga aatacccttc ttacagggta 71940 agaaactgaa taagtcatgg tttcatatcatcacttaagg attacaaagt ggtcagcatg 72000 aacgtctgca cggttcataa gcgtgaagctacgcgcggca agatactgtt caagcaggat 72060 aatatcgcgc tccgtcggtg atttgataataaccagttcg taaatgctac cgacaagcgg 72120 gcgcacgccg tcggtaccaa ttgtgagaatattgattccg tctggaccca ccgccacatt 72180 gttcatcaag gggataccgg aaatccggagcgaactgttc ccgctgaaca cacccactac 72240 cacctgatcg cgggtagcga gtgaaagcgtatattccgtg ctggagcctc cgattcccca 72300 gttgtggggc atttcacaga aaatatgggggttgacagaa agtgtaccac cggagaacaa 72360 cccaccgttg gcgaaagaac ccacaataccgatagcgaac ggctgctcga tttccagacc 72420 ggtaccttcg agacccatcg 
accgaagccattcgtttcca cggaacacca ccgccgacaa 72480 tccgttgtat gcatcccgca cgaaaatgggctggttatcc gggttggatt gtgtgaggga 72540 gtaagataga taagccgatg ttgaaacatatgctggcacc cacgcatcaa ccttatcgcc 72600 cgtgttgtag gaagccgtga gggtgttcgcatcaaagcga agcaccactt tgggcaccca 72660 gctttcaatg gtttccgccg gatcgacaaaataggggatg ttatatttct tagccagata 72720 gttttcaacg ttctgacgtt cagcgttggtaagtttacgg tcaaacacgc atagctcagc 72780 aatatagcct ctcaggttcc accctatgaacatattcgat tcaacacggt taccggtctt 72840 accttccata taccgcacac cgttgatgtaaatgcggtca agcggatatt tggggtggga 72900 agtagcaaag cgaacatcgt tggttgttggcgaaccagag agaccactaa tgtggtacag 72960 attgacatat gaacctgttt cattttcaagtatcaccgta atgatattcc agtcattcag 73020 gggtacaaaa gcgtctgcgg gttgtggaacagtattgtac ccaccgtaag agtccggtag 73080 tgcgttggag actcgagggt tgcgatatgaagagttcaaa taaatccagt gttcaacctg 73140 attttctttt acttcaaggc ggggcacaattactgaataa ctgtgaacat tattttcatc 73200 aaccattctg aaataaggta catccgtatcgtctgtgttg tgagtatccg actttggttg 73260 aggatcccac atggagaaca tccacagaaacctacccgga atgtttccgt aatttactga 73320 attgttagga atgggattag aagaattggttgtaatattg ctgtaattcg gatgataaac 73380 ccacaccgat ccactctgca caaacaccccagatgtcgta tccgggagtc tatccagttt 73440 ggcaaccatg ataatggtgc gttcagtattctgagaataa tctccggtgc ccggatagtt 73500 aatacgcata accgaaccgg aaccgaaataccaagccgga taaccgttga caatattttc 73560 gacgaaaata ggtttgcgga aatcgttaacctgagtagct ttaaatccgg aatatgcggg 73620 aacaaggttg ggaatttcgt caacgtaatcgccggtttca agctgaggag tggagccgtc 73680 ggcgctcatc cagattttgc agttggggacgtccgacggc gaagaatacg cattgatcgt 73740 ctgaacataa atgggataag tgcgaactgtttccggggta acgccatcgg tagagcgtac 73800 cgtaatgctg taggtgcccg gcgctacacccgacagatca ccgtatacac tcagaatccc 73860 ttcggtgcgc ccgtcgggaa gaatggactgggtaaaatca tatccggtaa cccagctcgg 73920 ggcggcggaa accgtagcag taatcgtattaccgtcgtta tcgtagatat aaatggaaaa 73980 cgttacggtg ttagaactct ggtaagtcggcattgtttat cccggttttg ttttaaatat 74040 tcttatttca ctcaaaataa aaagtcaaatagagataagg cacagaaata ctgctatagt 74100 cattgatgga gtctatatag 
tcttttccaatatacatgtg tgtgataaca tctccgctat 74160 tatagtcata catcaagatg tattcggctacatcgggagg aatgttgttc ataaagaacc 74220 ctgaaaatgt tatggagtaa atatttccgtcaaactgttc gtcaatcata ctgacagttt 74280 gcgtcagatc cccccccact acccgttacaacatacaaac tgaaagatga gtgtgtaacg 74340 ttaaattccg gataataaga attaaccggcaccactgaaa gagtttttcc atagaaaata 74400 tttcttaccg cttctttaaa agcattcattatggtttctt ttcggtaaaa aggttcggtg 74460 ttgtaagata gattgataaa tccatagattttatctctgt atagagagta actgtaacct 74520 actctatagt taattaagtt gtattcctgatttagtgtat ggttaacagg ctgactgtat 74580 acccatttgt aagcaagcca gcggtgtttttcatacatgc tggatataag cggctttcga 74640 gagatcacag gcgatgtgtt tctttgagtagcatttatcc agtaattgat aggcgattcc 74700 acctttctgc tcatgatgtt tgttctatttccggcaatga tcgctcttct taatgtgtgt 74760 ttattaatcg ttttgaacaa ttgtcgatagatagtgcggg tcttgttttt gacagtgttt 74820 ccggttctat gatatgctat aagtatctgtcggagttggt tgagatcaag gagaaaattt 74880 tgagttttct ttacagaagt gagcaggacgttaccctgaa aacttggaaa gaagtctctg 74940 gtggtacgga taacttcccg gatttttaaaagtctccggt caaaatgggg gatatcatag 75000 cttgcataaa tgttagtgcc cctgaagaacaatccgatta ttctatcaat ggttgtttta 75060 tagtttaatt ttattgtgcg tatggttaattttatcaaac gttgcatctg atttgctata 75120 aggttaaggt actttttaaa actcagaagtctgatggtta catatggctc taaggtggaa 75180 taaatgattt ttccaacaaa tctgacctgtttaataatct gttggtaaaa acgtagcaca 75240 tagttaaccg gtgtgtatcc ggtcaaagttttaaagaaag ttttattcac cagagtcgat 75300 ttaatgagtg atctggtctt gatgatagttttaataaaaa agtttttgat tgaattcaac 75360 agactgttaa aggaggtaga tatatttaaaactttaaaaa attcttcagt gctccataca 75420 aaagcggaaa gcaatttatt gaaagtttttaaattgggaa atctcagttt taaagtgggg 75480 gtgtagttga caggtgtgtt ttttgttttcaagatatact tgaactcatt agaactaaga 75540 ggttggttga cagcattata tccattgaaaagcgtaagtg cgtaggggtt gtcgtagttg 75600 ttgtcaagga aaagtcgaag tgctgtcgggtcatcatttt ctatcaacag cggatcatca 75660 atcgggtcta tataaggatc atagggaacggaaaatatat ccctgaaatc gtttgtcgaa 75720 gtataaatat aagataatgt atagttaaagattaccgact ttgtttcata tttgtcagtt 75780 gaataaaacc tgaaccgtat 
acttccagtataccagtcag gcggaaagga agatgccgtt 75840 aaaagattga gataggttat agtagacccggagtgcgata caccaaacca gaacttattg 75900 ggtggggatt gttgaacgat aagtgggctgaatactcctc ttcctttgtg gtctacttta 75960 aatctcattt tgttcgtagg ttatgtcatattcgtccagt atcaaagcgt tgtcaacaaa 76020 tagagaattt tcatatccgt gcatgaatggaatgggttct ccatcgagtg atatgtcttg 76080 aataccaatg aatatgggaa aatacagactgatcgatata tccattactt ctataatcgg 76140 ataatgatat tcgatagtta tattgaatactgttatttca gggtaattca aacttatggt 76200 gaagataaac ccatattcca taagagggtgatatagatta attgaaatgg tgttgaagtt 76260 aatgtcaaaa tcgagggtgg attcatattgtgtgctgaac gttttgccgt ttatgtcgct 76320 gagatcaggt gaataaagat aaatattaaaatcgaggggg gtatctataa tcaaagcttc 76380 catatctata taatagaggt ctatatcggcaatctttttt ccttgataat acacatcata 76440 ttccagatct tttgaaaaac tggtactgaaaaaataccgg taagtgtcaa acagatcaag 76500 aattattgaa actgtgtgtt cgttttggaagaccagacta taactgacat ctgaaactaa 76560 cgtcagacta ctggtactta caggatatctgaaataaatg tctctataac gtaaatacct 76620 gtcggctgat tcggtatata cttttattatgtaatctgta ttaatactca taatttatcg 76680 atataagttc gtttcccttt ttaaccagaatagtactgat gggtacattt ccgttcatga 76740 attcaccaag catatcgata ataaaatctacttcttccct cacttccagc ataagtgtgg 76800 ggggtatttc aagatagaca ataccatcttgataggaacc ttttacatat tcacggtttt 76860 tataccaatt gatatatcgg ctcatagaatgttgaatttt ccaatcgtct tttttaaagg 76920 aaaacattcc ttcatatttg ttgaagggatcgtaaacgaa accgacgtag tctttcatgt 76980 ttttctttaa ataaacaggc ggttgtgtttttattcagaa aaaacttatt taaagaaaaa 77040 agatgtatac cgaactgttc aagaaaagcaacccgcacaa ctcatattac tatcattacg 77100 tgcattttga cagtaattca aacacacattcaatcgatgt tcccggcgga aatgcgctca 77160 aaaacattct tattgtgggt aacgcttctaccccttattt tgtctctttt aaaatctata 77220 catcgcatag cgggtttgtg ccggttccagtatcctacga ttacgaagcg cttggaaaca 77280 atgcgctgat tacccctaat atctcttcatttgcagtttt ttcctctatt caaacctcat 77340 cgcttcgcat tagcattacc aatatcaccccgtttagcgg aagtgtttac atactgttta 77400 aagtcgagta acgtatgttt tacgaaccttctgtaagctt ttttgcagta tatcctcagt 77460 acagcaccag cgcggctttt 
ctcacagaattcaataaatc atcggcgtgg gtgctccaca 77520 aactgggcta cccggtggta tcggtggaattgacgaaaga tcagcttatg tttctctttc 77580 acgaagcatg gcaagaatac tctcagtatatttcagaatt tctgattcag gaaaactatg 77640 ataacgtttt aataaaaaac attttccagacggaagggga aatctttgag aagtttccca 77700 aacctaacag ttcgcttatc atcgagctttctgatcgcta tggaatgtac gacatgaaca 77760 ccgaatatgt aatcattcca cttaccgcttctcaatcggt ttatgacttg aagaattaca 77820 ttaccgcatc cggaaaaatt cacgttcagcaggtgcttgt caatagaccg cgcgttggtc 77880 ttggttctac gctgtacggt aatgcttttgtcttcaacaa ctattctccc ttcaccgtag 77940 gatacggcgc gggctggaat atcggtcaggtgctcacgcc gctttcctat cttgccacca 78000 ccatgcaggc taccgatctt gcctacaatatgtatcgcaa gctccacttc tttgaaattg 78060 tctctggaag tatgattcgc atttctcctgttcccgattc caacgactcc cggcttacaa 78120 tcagatacaa actggaacgg gaagaaggtgatcttattga aatgtacaat tcaatatttt 78180 atacgaaaac aggtctcctc gatctggaaaaactaaatga aaactccctt attgtgcttc 78240 ggcatatctt cctcatgaag gtgatcgatacgcttatttt catccgcaag aagtacgaca 78300 actacgcact tcccaatgcg gaacttacgctgaacgtcga caacctgaag gaactcaggg 78360 aatccaccaa ggaaaagatc gacaaatacaaagagtggct tgacaacatg aaacttcacg 78420 caaggcttca gcggaaagga gaagaagcagaagcgctgga gcgggaactc cagcgctatc 78480 ctatggggtt cctatttatg taatctctcacctgcaatca ctcagcgtgc aggcgcccat 78540 ggggtgatct cctctaccgg gctttttgaacggcggtgga atcttatgtc ccggtttgtg 78600 gtgtgcagta acttcctcaa cctttccataggttgtgagc atgggggtaa tccacctttt 78660 catggcttct ccgatttttt attgttggttcttatagata aataaccctg tggcacgcat 78720 cgtaagtgaa aaaccacccc aacagccaccaccggttcac ataacggaaa aattctttgc 78780 cttttttaaa ctctttcatg attgttcttcttttggaaga ttcagcttaa tggtgatata 78840 atccgagtcg gggaattctt ttcttaaatcctccagcgac tcatacacaa aaatcatccc 78900 gacggcaccg gtgttagcta tcttagaaagtggatagacg actttctgaa cgccgttatt 78960 gatgacaacc tgcaggtcat ccagaaagttaagctgcatc gcgacgtaat acacgcgctc 79020 ttcgttattg ttgtcgttca tggcacacagggttttaagg ttacgcatgg tagtctattt 79080 ttacaatgta ggttttgtcg ttatattctatcatgtgata atgcgcctga taaatatggg 79140 ttccttccag aaataggggc 
tcgttaccttcaaggtagac gaacaccata tcttcgtctc 79200 cacccccctg agaaaggcgg ataaagggagtgtgctgact ctggtgagta ataccgataa 79260 caacatcggg gtttttctcg ataaaatcgagaatttcccg ctcctgttca gtgggttcaa 79320 aatgcttcca ttgatacacc cggtaggggcgattcgaagc gatgtaatag ggaagagaag 79380 ctacatgtcg gttataaata tccagaacaatctcttcaat atcagtattt tctttatttt 79440 taatttcttc ctcgagtagc atgttaagatcaagtatcat gtgagccagc gtgctgactt 79500 ctgcttcatg cattttaata cctttcttttcaaggatacg aacaatgcct tcagtgtcga 79560 atttgagaat catggcgcta atgggtttagttttcactct ctacaagaaa acgaataaga 79620 tcctcaacgg tttttatctt tgtagtattttcaacatacc cgagactttt cagcagatca 79680 attgttttct ctgaaatgat ctgacagtatttgtctctat ctatgtattt tttaacgatt 79740 tcaagaccct cctcatcatc ctcacgtatgctaagtgcgt ggatattgcg gagcctgttg 79800 atatgcgttt tctttcccgc tttcagaatatcccacacgc cgctcccatg ttcggggtta 79860 acttcttcca gagaaagcgg gagattggttctggaaggat ccagcatggt gcagtagaac 79920 cagtagatct tgtctcccat ttgcgggggtttgcatccta taatggaagc gaaaagcgct 79980 cccttataat gaatgggaag tgtatggtcatacattttta tgaatatctc agagattggg 80040 acattctcct ttttcatctt cataacttcctctacatact ccacatatct ttcggcgctg 80100 tcagacgaag atattttcat tttgtgatagagatcttcaa tagaccagaa attcttttga 80160 gacacaaagt tattgtagaa cgctatggtggcggaaatga catcgatgtc gggttggctg 80220 atatacttca ggtaacccct gaaatacttcttgacaattt caggcaccga agagttgatc 80280 acttcgattc ccttcatctc ttctttaccgtctacagtaa ccgcaaagta gcggttgatt 80340 tctttgataa gaatggattt gaacacgaactcctgcttta actccagctt gaaatcttct 80400 cttgcattaa agttattttc catatagtcattgataaaag agttgagatg ttcttgaagc 80460 tcaccggctt ccgccaccgg atcatccgtaaaagctttga cgaaaatgga gtcggtatgc 80520 gaataaatga agcgatcgcg aatctgagaaatcacggagc gaatagacat gcgcccggcg 80580 gcggttacac tttccgcaat gggaaggcaccccatgtaca ccgaacggtt tccgaagata 80640 ccgtacatgg agttcatcat aattttaagtgcccattgac ggaaatggtg ttccatgttg 80700 ccagtttctt tgaaaagctt acgttcttccttacgtcggg tgaaaatctc ccgaatgata 80760 gaaggaagca cgccaaccgg ctctttcctgtaaaaccagc agatacccga cgggttgggc 80820 accataatga tatttcgact 
ttttaaaaattgccggagtt cctcaaagct gttgatgaca 80880 aagaggggtt cactccggta agaagggttcatccctgaat cgaagatgta gaggggaaac 80940 ccgaattccg gttcttcctg atctaccggaatcactttgt tctccacccg catacacccg 81000 taaaactccg ttacgaacgt agcgggatcgatattgaatt tgctgattac agaggggtac 81060 agcgatgtaa aatcaagatc gaatacgttgaagtaaatat cggggttggt aagttcaatg 81120 taagcaccgc gataacgata cttgttgatgttcatagcag aatacgtttg ttttacattg 81180 cgggatcaaa atgggtatct ttcacctcgataaggtaggt gtcttttaaa tcacatttgt 81240 ataatacgcc atcgagtgga gaaagatcaataattctcaa aacttcatcc gtatagaatt 81300 ttcgctcgat aagattatgc gtttccagacgtttgcagta ctcgaagata gcgtctccaa 81360 caacatagct ttccaggtgc cagcaggcaagcccatcata atgataaagc ccccaccaac 81420 catttgctct ttccctgacc actgcagggataacccccag atcttcaaga aatgcttcaa 81480 gtttattttc atccatataa agcaaatcgcgtcgcattat gttttttccc cgctcctcaa 81540 gaaacgcatc aacatacatc cacccggtttcatcccggat aagcttgcga agtgaagggt 81600 atgcctcaac cgatgtaaac ccggtcttcaaatcggtaat aatatcccgc gaaaaccgca 81660 taacgaaata ataaggggtt tcgtagtcgataagtttttc aataatttta atgatttcgt 81720 acttcatggc taaaacgcta cttcatggctaatacgctac ggttagttta agtgtcgggt 81780 aaacctttcc attaactaat gccacgcccacccccccgta aatggatccg aaaaatctct 81840 tctggaaaga aaagtcgtaa taattgagccctacggaagc gctgataagg ttgtgcgttt 81900 ttctaacctt cagcgcaaaa tcttcctgcatgaaacgtcc ggtttccggg ttgaaaaacg 81960 ttacgcggag cgtgttgccc ttccacgtggcatatctttc cggaagaagc ccgtagagtt 82020 tcatgtccac cggcttcgga cactccaccgtgtcaatctt ccccacgggc gtctcccggt 82080 aaatgatctg tttgaccggc tgggcaaacttcccttcaac cttcaattct gaaggaaaaa 82140 ctctgtccga aagttccact ctgacttccggtctgtaaac ggtgcggttg acgtaaagat 82200 gtatgttaac cgcgatcagg atcaaaaggagtgcttcttt ccagtacttc ataagtcttc 82260 ctcttcttgg aatttgtctt cttcgtctctgatcatagcg tatagaatga tcagatagtt 82320 aattgcgtca atgattctac cttcgacagcatcccgctgg tttttaaccc ctctgatcca 82380 gcgtgccacc cctcttaaat gtttatccagaaatacatac agcacttctt cccttgaaat 82440 acccaatcgc tttgcagttt cttcaaaattctgaaataca ttgtcggttt cggcatactc 82500 ctgttgggct tggagtcgga 
cacgatttacttctccaata agctctttta caatgcgttc 82560 gaattttgtg gtattcatgg atttcctccattaagtttct ggtatttatc ttatttaaag 82620 aaaaagatga atactccccg caaaatatttcttaatccac ccacctcaag atccctacag 82680 gatattgaat acctttacct caccaacaaacacatcatta ccggcgcgat aaataaagcc 82740 ggtatgagca ttgatgaagc ttgtgaatgtgttgtggggg ggatcgtgct cgaatataaa 82800 gaaacacacg gcatcaatat ttttgataatctgactatgg cggtggagta tttcattaac 82860 aggtacaaag aggatttaaa aaccgggcgcatttaactca tcatcttttc gttgataaat 82920 tgaatgagtt cctgtacgga aggataaaactgttcctcga aaatctggtg gtttgtcttt 82980 atcaattctt caagtttatc cacatccgccccgatctttt caaaaaaggt acgggcttcg 83040 gaaaacagtt cccctctggt ttcacattcggtataaacca gaagaggaac cataacccgg 83100 tagaaaaatt cttccggtgc ttccctatcaacatactctt caataagttt ctctgtaata 83160 ggaaccccgc acgcctcaaa cacgtcatcctgcacggcga agtgaatgac gccgttatac 83220 aaaaggtgcc gcaaccgcac cttccacatcagggaagggt cggttacgaa ctcatacacc 83280 atttggagag cattctcgag aggaacctcctttaacggaa gctcctcaat ttcttcttcg 83340 atcgggtaga agacgttctc ttctttggagaacttacgct ccggggtaat tataatccag 83400 agcgcttctt tgacctgagc agcgttcaagttgcccatgt tgtacctcct tgtttttgtt 83460 agttacagat taaacaattt gctggttttctcgaactcct cgaaccactg tttccggagt 83520 tcatccgaac ccctgtattt caggggcttgctatcttctt taactttcga ttccggttca 83580 ggttgctcct gggtttcgag tttcaggggaatttcaactt tccctttgag ttctttgagc 83640 ttgttctcaa tacccggcgg aacaatccgatttttgaaga ccagttccac cgctttgtta 83700 atcacttcat cagcggcttt acgggtgcgctgataaatgg catcctgctg agcctctttg 83760 agtaaataca aagcgcggcg ttttagataatcctcgctgg cgcatctctg catggtgaac 83820 tccagcgaac gtcggttgtg gatttcgctgttagccataa ggctctccca ccacatatac 83880 tctttttcag aattgaggat ataagcccccttgaatctga cataggtatc ctcatcaagc 83940 ttgaaccagt atgcatcata gaaactgtacccccgcaagg aatcgtcaat aacgtacccc 84000 gcaaggggtt cgacaaaaac aagctccagatcggctactg atagtacatc gatcatatag 84060 tcgcatcctt tgttcatagc gtctttgaatttttctctaa tcatggcttt ttcctccttt 84120 ggtttatcgt taaacccacc aacgtcaaaaagagggagtt tgtttttcgg atggagacag 84180 aattcggagt ttttgggcat 
accccgcttccagtgttcca cccagctatg acatacttcc 84240 tcaaacatat cggggtatgt ctctctgaacttgggaatta cgtatcgata gaattcgata 84300 tcgtatttga atagcgcatt tcttataaccgtaggatctt tttccattgt gctctgtacg 84360 cggagatttc tatgaaaatc aagactctggtacaggaaga aaaacgttgc gttcacaaca 84420 aaaagatatg ctgtaaactc gatccaactaaatgctctat attcggttgc aacagggtaa 84480 aatttgtaaa gccagtagcc cacgaaaccaaacgccacaa ccttcaaagg ttcgtgtacc 84540 ttgttcacaa cacgcgggct gtaaagcttaaaacccgaaa gcacgttgta cgtgccggta 84600 acgaaccacc tcccaatcca acacaaaccgatagaacccg atacaagaat ggtgatcatt 84660 gtctggctca tgtgggcgac ggcttcgtggtgcccgagca taaacgggaa accaataaac 84720 cgcccgaata ttagcaagac cgggaatgacagaaggtccc cggttataat gcagacggcg 84780 gcaatgaatg caatccacgc aataccggcggcaaacgtaa acgacagata aagggcgtcg 84840 ttaaacgctt ccgcctcctt gcgccacagcttcaaattat ctccataata aaaccccaga 84900 accggaaccg ctacaagata gcgatccagcccctccacca cttccttctt gctcaactca 84960 agttcgccct taccccactt gaggagcttatctcgataaa gcttgaaagc tgtaagggcg 85020 tactttcttg ttactccggc gtacatggctttccgggttt gagttatcaa tctgcttata 85080 acatacgccg gaaatccaga aaagtcaaggggattacgtt aattttttac gaagaagaag 85140 acgtgctacg tcgtcaaagt aaatggcggttttagtggcg ttgagacgct tgactttgag 85200 cggtttataa agtctgttct tgataaaaaacgaagctctg ataattttcc agacgtttac 85260 gtaaaaacta ctattctttt cgtaaaaaatgcagagggcg ataaaatcgt cagattgggg 85320 gttatagaca agccggtcat ttttctgaaacacccaagac ctctctattc tggtgtatct 85380 atatctgggt acatcctcac aggttttaacatgtatatat ctaccctcgc atatcagatc 85440 agccgcatag gatttgtctg aagtgatagtcagatcgggt ggggtgcatt cataaccaag 85500 gttggtgaga tattcataaa cggcgaattctcctatttta ccaacaaaat aattccattt 85560 tattctttcg gggttgtgct ggtggcgctttttgtattgc tcaagcacaa tcccgtcgtt 85620 tatctgattc ttagcatatt ccatgcagatgggcacatac tgatctactt ttatcatctt 85680 acccaccctt acccaagagc gatacttgtggacggattga taaggtacat caccgccgtt 85740 tcacggtaaa acgttgattt tgtgtaatggatcccaagcg aagagacaaa gggagttccc 85800 ggataagcca tttgtgtgtt gtatctattgacaccgcttt tcggaaatgt atctgctaca 85860 atggaaaaga ccgcaccctc 
gcgctctgttgctatacggt cgactgtaac ggtatccgga 85920 attggaattt caccatcgaa agatttgaacatactgaaag aaaattgctc aaaatgaggg 85980 attgactgag tcatctgaaa tgcgtagaaaggtttacctt caattaccac tcccgcgaag 86040 aaattctctc cggcaaaaag ctgacgaagcaacaagatag cgtcgcgggt tttgtttacc 86100 acctcaagca tatcctgatt gcgggtttcaataacgtgta cattatcccc ctccgccaga 86160 tttctaacgc ggttggaaag atttcgctgcaaaaacacca cggcggaata aggagtatct 86220 tcgagttttt gaacagggtt ttcggtaatgatatgatccg gttgataaat cgagccttcg 86280 aaagcgttat ccgtaatgac tccgatatgtctcagattgg attccatgta ggaaagcgtt 86340 gaaacataat aactgtaact gatcgtcgatgtggtaacct ctggaaacat gctgtaggtg 86400 tggttgtcaa tgagccacat tgagatagtactgtcagatg tacctttcag gtagaaataa 86460 accgaagtgg aaataacagt aggcattaccatcatcatgt agctcatgct gtaaagctga 86520 tctattccgg ggaaaagcag cgtaatcactttcccctgaa gcaccacatt gaacggaagc 86580 agcgaaagac cattttccaa atctctcatgtagcagacgc cgtggaaatt cagactatct 86640 acaacaaaac cggaattatc aataacatttacagaaactg aagttataag ctcgccgcga 86700 gatgtataga ccggatagtt aaaagccccggttacataat aggggttggc gttgctgacg 86760 gtaatggaga aggtaataga atcgagaaagctattggcgg taacttcata atattctcta 86820 atatccttat gttcattata gggaatgaacccgtaaaacg taaaataatt attcaacaca 86880 atatcttccg gaggaatgtt ttgacttaccggtataatgt gttccgtgat gtaatactga 86940 tcccccagaa acgcgtcatg aatgatcacatcaaactgat aatctttatt tcgtaaatcc 87000 agcgattcag ccagcgtgtt taaccatttaactttcattt tgcgattgaa ggcgtgatag 87060 gtaatacgat tgatctgatc gtgcagcgcgtcgaacgcat cgagaagttt tccggtaagc 87120 tgctcaagcg tttgagacgg gttctgagttgtcggttcat tcaaaatatt tgtttccaca 87180 aactgaagat cattgcggaa gtagaaaacatccatgatat tcccgatcat atccatatag 87240 gtgagatatg aaggatcgaa cacttcttcaggaagtcttg aagaaaggcg atacgggttg 87300 gatacgtcgt attcagtcgc agcataaaggtgcatatcaa gaagcacctg atacggattg 87360 acgcccggat aaattgaaga aaacgaaggaaatacttctt cccagaaatg tatggggagc 87420 agataaaagg tacccggaag cgatgtcgagttgtagtagt aatccagata tcccagcgcg 87480 taactctgac gaatggagtc gtctctaaagtaatggttga tgtgcgttac gtcttccgaa 87540 gcaaaccatt taatgaatgg 
aaggaagctatcgtagagtt tattttttat ttccttatat 87600 ttttcagcct tctccggata aaaacctgaaagcgtggtga taatctcatc atatttcctg 87660 agcagcgtga ttcctttgaa cgccgcctgaatgagcgctt ctgctgaccc gaatttgaca 87720 aatccggaga gcagatggaa attggatttctgctccatac ttccgaagct aaaagacggg 87780 atataggtgt tgattcgatt gtattcaatgtattccagta gatcgacggg actcctatct 87840 tcaaattcga gatgaaacac gttgttggcaaacgcatctt ccctaccccc aatggcgctg 87900 aaggtaaccg tgtctgtttc cgtgagatttaccggaatgg attcctgcaa atccacctga 87960 cgcacaagat acacttcttt tgtcgagtgaaggagtgatg aagttccata tacataaaac 88020 gtttcctgat aataaaccgg ggtacccagtttatgaatat tcccgtttac tttaaccaga 88080 atactgtagc tattctggat tcgagcggttacttcagcgt aatagttgaa caggtcttca 88140 aaagtgatac tggtgctgat gtagttttcaaaagcttcgt taatctggtt ttcaagttgt 88200 tctatggctt cgacggcaaa tgaaatggtgaggaatatct gattgagacc ggtttgcgca 88260 gaaaggtatt ctttcgaaag tggttcggcttcttcaagaa cttgctgata agcggaaatt 88320 tcattttgat aaaaattatt gaaaaactcctgtacatcgt tgaaactgac ggtagaggtg 88380 agttccagcc gggaggttgt aaaactaaacgcctgctctt ctacgtagct gtaagttcct 88440 aattttttga aataagcgta gacgggataggagtctacag agaaggtata aacaaacgga 88500 agcgaaatgg tgggtttttc aattccttcaaaaacttcag ccggaacaga gtcaaccacc 88560 agatagacgt ttccacttct tacatacttggaggttactt tgccaataaa atcaacggaa 88620 taaatgattt ccccggctga agaaccgctgaagtgttcaa tgtaaatgta aggatttgtg 88680 taggaaagag aatatggagt aaacccggctacaatttgag aacccgaaat aattataaaa 88740 tcttgcattc ctctttttct gttaaataataacgcaataa gtcaagtgca tctttgggat 88800 agaaaggctg atagtctttc atggaatgtggatattcgcc ccagtctttg taaccggcgg 88860 gaggaaacag aaaaccaact ttacatacgccgctatagag ttccaacact ttagaaattt 88920 cttcaacgct tacatccgaa tcaaaacagaacaccagttc tttaaccttt aatttttgca 88980 caacgtagga atccggaata cgattctttccacagagtac gcacattccc acacccaacc 89040 catccgttgc atggggaagc atatcgaacattccctcgaa cagataaatc tttccttttc 89100 gagccgcttc gtagaaatag accgggagcttacccaccat ataggaaaga taccgaacct 89160 tatcgaaagg ttgatagaat tgaacgtttcctatggaatc accgaaggca acccgctttt 89220 catctaccac cttgaaaaat 
ccttttgacgacatatagct gagaagttcc ggttttaccc 89280 ggcgctcttc aatgatgtgt ttgataaccggatattcgat attttcctct gaaagaggag 89340 ccgccttttt aaaaagcgag cggtagtaaaagtttttctt gacgtttgtc tgttcaacat 89400 ccgtaaaatc gatatcaccc gaattcttataatatgaaat gagttcgaca ggtttaaatc 89460 cgaaaatgcg ctcgaagtct ctataaacggtgcctgaaaa cccacaacgg aaacagatga 89520 aaagaggggc gtctatggaa aagtaaagcgtgagtcggcg gttgttttta tgcggggcgc 89580 acttagggca caaacaggct acttctttaccgcccccggc aactttagct tcgctgaagt 89640 atttggtgag gatttctaca atcatggttttttctacaaa aactctaagg aatcacaatg 89700 gttcccggtg tagtaatcac aggatgatttaaagtaatgt tcacatcggt taccgttata 89760 tccggcggaa tcgttgaaat gtcaatcgggatatcgcagt atttgctctg agaaatctct 89820 acggcttcga cggttacgga aaaattgaaatcatcgtcgg tatcgaatgc agccatgaaa 89880 aactgttcat cggtacgtac gttgacaatttcataccatc ttctggattc cccaagcacc 89940 agatctcccg gttgtggaaa atagtcgaattctttcagca cattcctaag aagatgaagt 90000 cggagtttcc gcaccgatcg cattccaacctcctctgaag ccggttcccc aacctcatat 90060 tccacacgac aggggatccg gtacattttatattctctaa tatctttctt aggagactct 90120 ccatacagat agtgcaatac gtcgttttcgtcatcttcca ccgcgttttc aataatacga 90180 acaaaaagga agttggcgtt gagaatatcctcaagcgctt caagggcaaa atgctgaaga 90240 agattgagtt cccgtcttcc ccagaaaagaggattacgct tttttatcat ttttatttaa 90300 ataacccctt ttcaattctc tccgttagaagcttcttttc cgacttttgc attttctctt 90360 cgtcgccgtc aaggcgggca aaaaaccaacttttatcatg ataatccata agtcgaagag 90420 cgaacctttt gcgaaactct ccggggttttcttccggaga aacctgttcg gaaatctctt 90480 tataaatcgt atcaaaagat gactcaagctgatttcgcat atcggtataa atttctttga 90540 gtttcatcac ggtttcctgt tcatccggggtaagtacaaa atcatcaagt ttgttttcaa 90600 gaaaaagatc ggcgagcttc tcaggagtgattgtagtttt aatccggtgg agctccagat 90660 ataccgggtg cttgatcttt gtgcggtaataaacacgcgg ggcaatttcc tgtacggcta 90720 caaatccttc atataccacc tcataaccgtctctcaggct tttgaaaagc ggtgtaactt 90780 cctcaaatag ttcctgaagg cgattggcacgaaaaagagt atagttttgc tcttgagaca 90840 gaacagccgg tagcttaaga tttatttttccgccactttc gttgaaaatg cgtacggctt 90900 cttcggaggg acccacctcg 
aaatatcccttctccggatc caccgaacgc acaccgatca 90960 gaatgatatt tggctcctca taaggaaccaccactcgcgc gtccggatga accatttcaa 91020 atatgtaaca gtatgaggag ttcaaatgatagagaaggta aggcggatat ttcttttcaa 91080 aggtttccca gaacaattct cgatatgttttatccatatg agtggtaacc attccgtttt 91140 tgacaatgga tccatttgcg tcaatactcccaagagtgtg aattttccac ccttcatcat 91200 aatataaaac cacacaagta ccatccagcttttcaaccag tttcatggga agtttgaaca 91260 tgaaaccggc tttgcgcttt tcattcaggggagacgcgta acgaagcgtc tgataatagt 91320 ttacgatttc cggctggagt tcttccccccagttgaaaaa tttgtcaaag ggataagaca 91380 gaactttcca accactatcc gttttgcggagaatcgcccc gcgacaggca aggtgatata 91440 tcttatcaaa cttacaacca aggtgatatttgaacatgta tagatcaccc cggtttttgc 91500 acataatccc ctccttgcga agggattcgacggctacttc cggagactca aaagagttca 91560 ggtgttcgat aaggtactca accgggtattttacgttcat cgattccata gcgtactata 91620 agtcttgttt tgagttttcg gaagcgtcggttgatcaggg agttcgtcta caacttcata 91680 gtttccggat aggaaaccgt agcaatcacacacctccttc gtaaatctgt ttacgggaat 91740 tggttacaag tttttcagct acgcgcaacattttgttact ccacttttca accggggttt 91800 ccacaaggaa atgaccaata cgggtatcaaaccggttgaa gcccacattg ttgcgctgag 91860 ccgcgcggtc tttatccagc gccgccacgatcgccagctt ctgctgaagc tccagcgcat 91920 aaccccgctc ttcctccgaa atggttttactttcttcttc ctcctcccgc tgctgcttct 91980 gctcagacgt agtaatcaca tcgagcagagatacaggctt ctcaagttga gtcttcatac 92040 gctcatgatt gagtgccttt tcgataatttcaatcttgcg cgtcaggtaa tcggcaaagt 92100 tttcatccag cgtatgacgc gcaacaatgtagtgaatatc cacacattcg gcttcctgac 92160 caatgcggtg gagacgatct tccgcctgcaggatattgcc gggtacccag tccaattcca 92220 caaacacggc ggtcttagca cgcgtcagcgtaatgccgac accagccgcc agaatgctgc 92280 agagcaccac gtccacctta ccactctgaaaatcctccac cgccttttga cgctgcacca 92340 cattttcctc gccggtaatg cgggcgtaggtaataccttt agcttcaagc accttctgaa 92400 tgatctcgaa cacatcatga tggtgtgcaaacacaaccaa cccgtccact tcttcctctt 92460 tcacaagaga aacaatatag tcagcagcgaacggggcttt gtgaatggca taaaagcgcc 92520 gcatttctgc aacgcgctca aacataaccttcattttttc atcaaactcc gccattgcct 92580 cagccagatc agcgctttca 
accccaacccgctcaaactc gcggagaacg gaaatataat 92640 ttttgagatt ctggagatct tcagccagcttgaaaatttc ttcttcagca aacatcttat 92700 taagttttac aggaacgatt ttacggcttttcggcggaag ctccttgagc acatcttttt 92760 tcaagcgacg aatcatgata gtggagcgaagctttccctg aagttcttca aggttacttg 92820 caccacgaaa atcccaacca tacccattatagtaagcgtt gcaataccgc ttggcgtagc 92880 cccagaaatt accaaacacc ttcggagccgccatctcaag aatgggataa agctcaatcg 92940 gtctattgac gataggagta ccggtaagaaagagcacctt cccgccctgt tctatggaag 93000 atttgacaat agattttaca aacccggagcgcttcgtctt cgggtttttg atataatggc 93060 attcgtctac gatcacaaga tcgtaagcataatcctcttc cgaaatgcgg tggagaatgt 93120 cataattgat aatgtaaatg gtgtttttcagagaaaaatc gacttcattg ccgttaacca 93180 caataatttc tttttcgtga accacccagcgcttcaattc ccgctcccag ttgtacttca 93240 gagaagcggg acacactacc agcacgcgatcggggttcat tacattgata accccggcgc 93300 tctgaattgt ttttccggta cccatttcgtctgcaatgag agcacccgga tattctttaa 93360 aaacttcggt aacaaaatgc acccccgccttctgaaatgg gaaataatca tatccggtag 93420 gtgcaggtac ggcaaaatcg ctgctggtgacgctgctgag ctcgagcttg tgattttttt 93480 cttccaggag aaggttgtac tgctgagcagccttttcgtc gaaataacct ttcagtttac 93540 tcgcataatc gagaattgtc gtataccataccctcttatc cggatcccac ttccacccgg 93600 catttttggg gatcagacgt tcttcgtaggttcctttcca ctcgaaccgg ttgttgtaag 93660 taacgtagcc catgaccgcc ctgtctttggctgtcaatct gctgataata tacgacactg 93720 aacaagaaaa gtcaacccct tgacagaaatttcaattaga agggaaacga aagttgaaca 93780 aacaggtgat tgaaatgctt ccgatagcgcgtcggaagat caaaggtgtc gtggataaac 93840 ttttcaaatt catctttttc accctctaaaacatcatcaa gccatagtct tcgacgttca 93900 aggctcccaa taatcgcctc aatcggatatgcttctataa aagcattacc gggataatca 93960 tctacaacac tcctcacaac ttcttcaatgtttgttgtat ctatcaatcc cctttccgta 94020 taatacacta tatgtggttc ttcgtttgatccgtaatctg attctaagta tactataact 94080 tcatattcta ctatatcttt aacatactcgataatacttt ctattccact ttcaaccatt 94140 tcctctactt tctcccccca ctcttcatcaaaatgatcta aaatatcatc cgagttgatt 94200 attttttcta taagatcata gtttacattttcaggtttaa ttgtgctgta aaaatattca 94260 aaacagccgg ttgtcgacat 
atttttataaagagaagcaa ctacatcggg atatgcttcg 94320 tgttccgccc atttaccaat atgttgtttgaaaagaaaat aggcgatacg ataagaacct 94380 ctgttggctc tccttttttc tatatcttctggcaactcta tattaagata ttcatgaagg 94440 tctttaatac cctccgcttc ccccatatgatcaattgcga tatctacaaa ttgtcttatg 94500 agatcgacaa ccaccgaaag attgatatattcctcgaatt tattcttctg ataaagcaca 94560 aatacctgca gggtttcagc gggaatataaacagtatcat aattataagc gaggacgctc 94620 cttacactgt ctaaagccat attaatataatattcgattg catcttctga aggtgcaaga 94680 ttttctacca gatggggata tttttcagcaatcacatcca gcatccgcgc cgcataagta 94740 gcatagggtt cttcgtcaat cggtaccattgtattatcag gaaggtgaaa ttcctttccc 94800 ttaccaaatg tagcgattgc gtaaacaagtccgtcatacg gattgctact atcggcttcg 94860 aaatctattc ccggtttgat cgccgaaggatgaaataccc ccacgaacaa gggataataa 94920 tagttgaaat gttgatcgcc aagtcccacacacacatgaa tatgattacc aatcatcccg 94980 agaatacgtc tgttttccac cacattttgggggtgatata tgatcataga attgtcttcg 95040 tagataatca aatccgcata ttctgtgtgcataagctgac gaaacttctc ccacaatgcc 95100 agataagaag cccccacctc ttccacttttccggcgcgtt gatatgcccc aacgactttt 95160 ccgggatcga gaagctgaag gttatccgtaagatcaaatt ccctcaaaac ggagggttgg 95220 tgtcgcaccg aaaaatacca gcgaaatggctcgatgtaaa gagggagatc gatcggatcg 95280 attttgtcgc tcatcaccat acgataaaaatattcaacca gcgcatttgt cgagataagg 95340 ttatattcgg gattgttgcg aagcggctcaaaaatagact gagcaacttt gtgaagaata 95400 tttccaaaaa ataaaatacg ctcccccggatcacggggaa cttctccaag atttaattgc 95460 tgagccagaa acctgagttt gttttcgggaagttttgcaa gtagttcttt acccctacgg 95520 gaccccggag caggaaaagt aaacgccatttttttatttt aaataactac gccccaaaat 95580 ccacataaag atagcaaatt ttccagctatcctccatcga gatagtatat tttttatccc 95640 cctctttttt aatctgataa tcttccttaccgacataatg taaaaattgc cttataagtt 95700 gccacaacct atcatattcc tccgtttcatcagtgcaagc cagcttgtac tgataatcaa 95760 cgaaggacat atcgtttatt tcatcataaaaattgtcatt gccccccaca atgataactt 95820 ctggatcaaa tacaacccat tcgataaaatcaagcataaa ctggcttaat tccggatttt 95880 taagaatata acgccagttc cagcgcgattcatgatcaca tcgatcaagc ctttctatca 95940 ccttataatc ttcattatat 
ctggcaagtagaaaacgtct catttcggca acgttcatat 96000 cgtagctcag attatctgaa atatcgttcaacatctcccg aagtgcgtct ataaagagac 96060 gcttgatagg ctcgcgttca aacatcaattctacaaactt cacaaccaca caatccagtt 96120 ctctatgatt tttcagataa gcgtcaatgtatatatccga cgccgcgttg gcaatcagat 96180 agcacatctt agccagcggc gttttgcagacaaaaaattc ccacccgaaa tcgcaagggt 96240 aaaaatcctc cttcacctta tcaggataacggctgataag gatggagtgg gaagacgatg 96300 aatttgtgga aagaccgaaa cgaatgaatgctttcatgac ttcctccttt gtttgtcaat 96360 ggttacatag acaacgtaaa aaacaacggtaagaataaaa agaagtaaaa actttatgtt 96420 caaccagaag ggattctctc ctcttagcaccagcgctgca gggtttctca agtataatga 96480 aagataaacc aaatacagcg caagcgccagcgaaatagag cctatcagta ttttgaacac 96540 tttcatagtt ttttccagat gttgagaaaagttaccggat cgaacagttt ttccctcgac 96600 accctgaaac ccttttcttt caaataggatcccggagcaa caagtgcttc cggctcactc 96660 atgtcgatgt agcaggaaaa aagtccttcctgatcggtag tgctatgtgg aaattctctt 96720 ttgaacaccg ggtaagtttt aacaaaaagggaatccaccg aaatggtgag ttctttcata 96780 aaatacttca ccgcttccag tgtcttgttttctggaattg gtatgttatt ataagtttct 96840 cccctcccca ctttcttaaa accaagtagcagaagatggc tcccctctac ctgagaaagc 96900 atttgcatca tttcaatggt ttcttcaaagggaacgctcc cccatacgtg ctgagccacc 96960 agttgcacat tggggggctt ctgattcagaagatcgatat acgactccac cgatttcaaa 97020 ccatgcacgc tgaaacctat cccaaaaagccgattcggaa aataatgccg gtagaacttc 97080 ataagctttc gggcaaacga agcattgaaggtggttacgt taacataacc ccggctgaac 97140 gttttgacaa tgcgatataa ctcttccagaaatctcccct tccagaaaaa gcaggggtct 97200 ccacccccta tgctgagttc gtaggtacccatttcgttta acattcgtgc gaaccggatc 97260 atgtcacccg gatcacactc tgatccctccggggtggagt tttcataaca gaaagcacac 97320 ccaaaattac acacgttgga gggtttgacatcaacgatat gcggcacctg agtcttaaac 97380 atgattacct ctgagttttg ttttcggtgcgtatgttttc cacaaaaact acataggctg 97440 aaaatacacc aagaataaaa atctgagcaaaccccacaac ccccgttttt ggatttactt 97500 ccgcttttac ttcaaaaacg atacatatgattcaaaggtc tttgattttc ggaatttctt 97560 cacaaactgc tcaagggctt cgtaaacgtcggggcgaacc aagttcttct ccttcgtgaa 97620 caccacccgc cttgactccc 
agtcaaccccgaagttaaaa aagttattaa ggctgacaca 97680 aagcgaagtg aggggaaact catatatttttacaatttcc ccatttctgt ctataagcac 97740 cgaataatat cctctaaaag cgtcgcacaggttgcgcaca gtctccagaa aatccgaaac 97800 cggcgcaaag tcttcaagca catagcaggttagttgaacg gctttttctt cattcaggtg 97860 cgttacggaa tacattaccg gcttgactaccatacttcct ccgttttttt gaccggaaaa 97920 ccgaatcaaa acaccagagt tccattccacacaatactaa ttaccagtat ttaattccct 97980 gttatttcac ataccctctg gatcattctgttttctttct tatatattcc attgtcagtt 98040 gaaaccaaac agatgagcca tgccgaacttcattacaaac atcaggaatt cccgttttaa 98100 ggaagttctg accgaaatgt accattgccatcacgaaagc gagtaccacc ttgagggaaa 98160 tgttttaaat cacacgctta tggtattgcaggtggtagat aagataaccg ctgatcaccg 98220 ggagcaaact aatctatcct taaccgcccttcttcatgat agtgggaaac cctatacccg 98280 tgttgtcgaa aggggaagag taatgttccccggtcatgaa ggggtgtcta cgtatatcgc 98340 tcctcttctg ctgtgtgaag tattgagggattccctcatc acaccaaaag acgccattca 98400 aatcctttac ggcgtcaatt accatatgttgcactggaaa aatccaaacc tttttatgcg 98460 gcttttcacc gaaatggtta attatacctgtttatataac ttcttgaaaa aattcaatca 98520 gtgtgatcta aagggtaggg tttctacaaaaccccaaaag caggaattcc ccgtaatcca 98580 ttattttgag aataccccga tcggtactgttgagcgccat gtttatttta tgatcggggt 98640 tccggggagt ggaaagagca cgtttcttcagaaagttgga gagggggcga ttgtatcccg 98700 tgatgaaatc atgatggaat acgccgctgaaatagggatc acaggagact acaatactgt 98760 tttccgggag attcacaaca accctatgcataaaaccaag gtcaacaacc gctacatgaa 98820 cgctttccgt aaggcggttg aagagaatgaaaaggtattt gtagacgcaa ccaacatgag 98880 ttataagagc cggagacgtt tttacaatgcgcttcggcgg gatattgcgg aaaccgtggg 98940 ttaccattat atcgtaatgc ttcccgattattttacgtgc attgaacgcg ccgaaaatcg 99000 ggaaggaaag tcgatttcaa gggaagtggtaaccgatatt gcgcggagtc tgcttcttcc 99060 gtgcagggaa catcccaaca gcattgatacgacaatttat atgtctgatg ggcatgatga 99120 acatgtgttg agagtagctt ggtagtttttaagattcgac gatgctcccc ctgctcaagc 99180 ggggggattt tttatttaat caaaaagtggaacctttaga gaaactactt gccgttctca 99240 aaaagcttga agcgtttgag gaatatctttcaaagataga tcttggaacg ctggatcagg 99300 tgattacccg gcttagaaag 
ctccgggaatccaacgaaaa actgtacaaa gactatcttg 99360 aagaacttga aaagcttttc agcaccaatcaggaaacgct cgaaaagctg gtagacgccc 99420 tttcggagtt tacggaagag gaaaaggaaaaacttgaaaa tttccttgag acgcaccaga 99480 aagacgccgc acgacttctt ggatatattgacgtttttga agcgtcgtgg aaacacatga 99540 gcgccgaaca acgggcggca tttgaatcctttatcgacag actcagagaa cttcgtagaa 99600 acctcaatct cgatcgattc aagacggaaacagcattcga tgttttcgac aaagcacgaa 99660 gagaccttgg ggtgccttat gaatatatcaatcggtttgc attagacttt atcagatttc 99720 gccagcgatc agaaattttc ttcaaacagatcatggcatt tttcacctat gaaagggtga 99780 caaggtatac cgcttatgga atggcaattaatctggtaca gggggcgctt gaacgctaca 99840 ttgaaactac tgatgaagtg atcagagttaccggtattta caaccggctg cttagggatc 99900 aggcgcttca gttttaccgc gcaaatattgatctgagacg cttcggtgtt cagctccggg 99960 atactacccg ctttattgcc gaattttatgcatttacgcg cacgcgcgat ccgttcgcgt 100020 tgatagctca gacagcagcg ggtgctggagacaacatcga cggattcatg cgtcagatga 100080 ttctcttgaa tcagcgcctg aatattgacagccgcacgct aacacgtaac atgctacttg 100140 cagctaccac gctggaagat aatatcatgcaacatattca gcttatcact gcttttgcaa 100200 atgaagcaaa tttgagcgct accgaactggtcagcgatct tgttgaaagt tattcagaat 100260 ttgtcgtgct gcttggaagc ggggcgcgtcagatcacgca aactcagatt gcactggcgc 100320 ggtggaatat gtcgctcaga gacggcatgaatattctgaa aggtctctat caatctcagg 100380 aatcggtgat cgactcgctc attcagattcagattctctc gcgccagccg gttgatttcg 100440 aacgcttctt tggagccatg ctcaccggcgacattgaagg aatcgtcgat cagcttgcgg 100500 aaatggcaat gcagatgcgg ggcatgatggatgaactccc catttatcgt atgcagtttg 100560 agcgggcact tgaagggttg ggcttaacctccgagcagat cgcaacgatt cttggaaagt 100620 ccagagagca gcttacaggg ttcggcactatcgtggaaga tttccgccgg aagctttcac 100680 ctgaaatgct tatccagact tttgatgaacttctgagacc caacgaatgg gaagaattga 100740 agaatgcggt ggatgcattc ttcgagacgtttatgcttta cggcgctgaa ctcattcgca 100800 acatgattcc ggtgcttaga attctgacacagggaatgca attgatgttc aagtggtctc 100860 aggctatgac cgatcttatc gataaggttggtagccttgg tggattgttg aaagacaacg 100920 ttcttggaga tttcttcaaa agcattttcgcgtttcttgg acccggtgcc gcgctttacg 100980 ctattgcgaa 
tatcggaaag cttggaactgcactcaagat gttatttgat ttaatcattt 101040 ccattcctcg ccgcattggg ggaggggttgtaaaccgtat tggttccttt ttcagtcgtc 101100 ttggagatgt gttcaagaag ttcttcggtagccgggaaat gaaacaggtt gcggaggatg 101160 caacttcaag aagaggaatt ctccgtcgaatcaccggtgg agtcaaagat tttaccaaaa 101220 acctctttca aagcttttcg cttggatcgatcgttcgttt tacagcagcg gtgggggtgc 101280 tggttggtgg tatttatctt ttcggaaaagctgtgaaatc acttcaggga atcgactggg 101340 gtgagacttc gaaaggattg cttgccttctttggagcgct taccactacg gtagggttga 101400 tcagtcttgg tggtcttctt tcacttcccgcacttcttac cggtcttgca gcttccatag 101460 gcgctattgc cgtggtggcg ggcggtctctatctcaccgg tgaagctatg ggagtgtttg 101520 ccagcaacct tcagagactt gcttccacgctggaaaccta ccccaatctg acttccggta 101580 tattccgtct ggctggagca cttggtacgcttggtgccgt cggtacgatt gctgctccgg 101640 gcatgctggt tggagctatt accgaagccgtaagtgcggc gatcaaaccc gatgtggctg 101700 taaaagccat tatcgatcca gatgtaattaccgccggaga aaaactgatc gcaaacaaac 101760 ttgaccggat tattgcgctg cttacggaaatgcaaaatag aacggagcca agagtggtta 101820 cgctgaataa gccggaaaag ccggttgaaaaaccaatctt cagcacattt aacttttaat 101880 cttcatcctt ttcttctccc tttttccagaccggatgata ccagtcgttt tctttcatca 101940 ggttgatgta aaatacatat ttataattttcaaatttttc attcatgcga agcttataga 102000 gataatcatc tttgaacttt tggataggtattcttcctcc aacataaagc agaccacctt 102060 ttcccaccac aaagaacata tcttcggggaaacgtttgat ctgaagcact tcgatcacat 102120 cgtcaatggt aaaaaacttg tatgcggtttctttggatgc gttgtaaatg taataatgat 102180 gcgcaacgtt ggattttctg ttatccgtaatcatagaagt ttctgcttaa gtagctaaat 102240 atcactatta aataaccgga tttgatatttaaagaaaaag atgaaattaa ccgatctcag 102300 aaataaagtt acaaacgcat ataaccagatttccaagcag aaccgcgagt taatcgccgc 102360 caaacttcgt aaagactcca gtgccaccatttacttcggg gctgctgtcg aaaaacttga 102420 cgacgctacg atcaaagaac gtatgatcgacgtttttgcc acgatcattg ctcaggcgta 102480 tgatcgcgcg atttccttgc gcaaaggaaaaccgacacat ctaccctccc ctcagtcaat 102540 ggtacttacg ctggcaagat tttacgtggaaaatgaagac attacgctca gcaaacttaa 102600 cgaaatttcc attgcgctgg gctggtatatcgcgctggta aacgaaccga atttgcttca 
102660 aaaatacaac ctccccaaac agatcacggaacttgagccg gagcagcttc tgcacactta 102720 caaccagatc gcaagatatt ccgacacctatcaggtggaa ctggtaaatc gctataaaga 102780 aattatcgat ttcctgacgc aaaacggtgaagagttctgg gaaaaagaat acggggttat 102840 tttcaaacct tcctcttacg aaatcaacgccaaagtgctt cagcttgcct ccgatcgtct 102900 ttttatctgc accgcactta accctattttccacgacacc tattatcctt actatgtgct 102960 tgtagtcaac ccggcttaca agggagacggcaatgttcac tatatcaaag acggcgtcaa 103020 gggatattca ggtatggagt tttatcttgccaccttctcc gacaaacacc cctaccagaa 103080 gggattattt gctcagtttc gtagccagtataatatcgcc acccctctca gctttatcga 103140 aagcagactg tatctggggg attttatggagtttttgtgg aaacgtaaag atcttcagcc 103200 gcatatctct aaactcataa acctttacaaacaacacccg gcttatcttt tcgatgagaa 103260 cgcaatgaaa aggtttgtgg aaaatgagcttttcgatttc aaaaatatca acgactcacc 103320 cggcgcacgc gaagccgtag cttatttttattccaagatc gacaaccgtt cttttatcga 103380 ggggctgact ccgctgatcg gagccgccgttgaaacggtt atggaatcgg gagaagaccc 103440 caattacaaa aatgtacttc cggtgctggtagagcttatg gtcaaaaaca actacgctat 103500 gaaaaagatt gaagaagctg taatcgaagcggtgcataga aaagcggaaa acattctcaa 103560 attcaccccg gaagaccata tcagatatatggcaattcat ttcgctcata aaaatattcc 103620 ttcaaattct gaagaagaag gaagagattttgccgaacag atttattata acataatcag 103680 acctcagatt acaggcacct caccgtatgctattatgttt aaacgtttta tatattcaat 103740 cattcttgcc gaaatgaaag gtattctcaaaaacaagata aatcaggtgg ttaaagaaat 103800 ggaagaagaa ttcgggtttg gagacatttcccttgccgat ttcgactggg ggggcggtga 103860 agaagacgaa gattcgttcg aaatggaactttaactggaa gtgtacaccg tttctccgac 103920 gatatagaga cgctgtccac gagatgttcttgaaacctga aaggtggcat tgtaaatgtc 103980 gtgatcttca accacttctt caagctttttcttaatatct cgataaagaa atgacttttc 104040 atccacgtag aagaagtaca tccacgtctgatactcttct ccccccattt tttccgaaga 104100 gtgatcttcc tgatacaggt gaaatccgtaatcctttaaa ataggtatat gttttttgat 104160 ggcgtcggat gaagcctcgc ttccgtcataaaataccttg agaatcatat tctccattcc 104220 ggtggaagga atgacaactt tcccttcttccgtatacgca ccctgaatgg caataagcag 104280 cagccccgga gatatttcat caaagaactggtaggtatcc 
tcgacggcgc gaatatcggt 104340 aatcaatgta atcttggttt taccttttttgacaaggttg cgaagggcaa aaatggaaag 104400 gaaaacaacc gcttcgagaa cctgatccggagacaaatct tccgatcttc ttatgcttaa 104460 gaaatactcc cggtggtagt ctctcagacgaaatctcaga tcttcctctt tcctcaaaac 104520 cctcgcaatg cgacggtctg tttcaacaatgaagtcggta tgcgttattt ttttctccca 104580 ttctccgtcg aaaaattcta catcgtaaacaataaaacct atcgtgttgt agtggtagtt 104640 gatttctact tcgattcccc atgaactatcaggcgcacga aacagaaagt ttcgcggggt 104700 aagcttccac ttctccagaa ccgcgctttgaaaattgacg aagtttgtaa tgattttcct 104760 gagtgcattc ctcagttcac tcataaatagcgataaagag ttttctcaac cgtttccaat 104820 atctgaagcg actgtttggt tccaaatcgagattcctgaa gcactcttgc actttgaaaa 104880 gaaggaacgg ctacaacatc tatcgctgtgataaagaagt cgtcaacaat ttctacccgt 104940 ttttgctggc ggtaacctac tttggttttaccgctcccac gaagggaaaa cccgaaattg 105000 ataccgtttt caagaagcga tttgaccagatttccgtaag gagttgggag aatgcggaat 105060 tttccgtaca ctttatttcc ttccatccatacgtctaccc actgcacggc aaggcgctca 105120 agcgacacga acccgattcg aaaatcgttctggtaggggt gatccagctc gccgtacatc 105180 tgaccctttt caatctcctg cttcatacgctccactgctt tttttacggc ttccggcgtg 105240 tagagtgtac cattgtcgga aatgacgtcggcttccataa tcagagccgt ataagtttta 105300 tcgttaactt ccatagcggt tattcttcagttttattgtt ttcttcttcc tttttgtctt 105360 cggcgtacag gaaaacggct tccaccgaaggaagcatttc gcgaagccgc gcaagcacat 105420 cggtaagcgc aaccagcttg gaaacacttccgcgagaagc ttcgatcgtg ttgataagct 105480 cgcgaatgta attgcgatag ttttcatccagcgtaatctt gtcaatgatc tcttcgatca 105540 tttcacacaa ttcctcgctc acgcgggcaatatccttttc cagttcttcc gtgatttttt 105600 cggcttcttt caccctcttt tcttcggggatttcataact attctgatcg ttttcctcct 105660 ccattaacac cttttcctct tcatctttcttgtcctgctt gacatcttta cggatttttt 105720 ccacatcacg gctggagtaa cctaccggcgcgtcataccc cgcaatgtcg ccggtggtgg 105780 tcatctcctc gatctgtttc aacgccgattcataaaggtg cgagaggatt gccgccactt 105840 tagcaggttt ccttgcgctc aggaaaagctcagccacacc agccgcgcga taaatgcggg 105900 gatcgatacg ggtttcgtaa acttttgaaagctgtttttc tccaatgatc ttttccagcg 105960 cttcatggaa accgttgtac 
tggcgctttttgcggtattc gtggacgaaa tagggaaccc 106020 tttcatattc ctgatctatg caggcttcgacaattttggt aatgtaataa acatcgtcag 106080 attccacgaa attcatcaat tcacttgcggaagccgactc cagcaatttc ttgtcttccg 106140 aagaagcctt catgaactgg aatagaaggattatatcctt ctgcttcata acaacctttt 106200 ttttcttaaa taaataaaat cgaaggagaattaaacaact acggtattat aaccccatcc 106260 accgatatag ttgtaattgt ggctggttttatatacgggt aacagtcatg aaatactctg 106320 tttctgatgt ggcgaatttg ttcttgtgtcatatcacccg gtattagagt ggaattcaga 106380 taggctgcat ctttagttat tgctatgtgatacaaaaggg gcgccatttc actgtaagtg 106440 tgtgtatctc taaatataag tctattattcaaataaaatg aaacagtaag atcattacta 106500 agagggtcca tttcataatg aaattcatggaacacaaaat aagaccggtt tggattataa 106560 aatccccatt cagatactgt tatactatgagttactttct ttgttaaaaa tggggtaact 106620 atagtcgaat aagttactcg aatacccagattcccggaag ataatgtaga ataatatatt 106680 gaaataatat tttcaccact taaaattctacctatgtaac taacatctac attgaacacc 106740 gaagagtcat ctgcataacc gtcaagaaatattgtctgaa ataaatggaa tttttctttt 106800 ttaccattta taaaaagttc gatattgtttacagttcccc aataatccaa atcgtaaaaa 106860 agactaaaag aataattggg gaaatgactctgatacgccg cttgactacc accagcatag 106920 ctatctgcaa tataaaattg attccatctacttatagctc cataattgat atttgccctg 106980 gcaataaaag cattatgacc aaaatcatagaaaaaagaag aattaaaaaa ctttgttcta 107040 actgattggg taccatttat tccatctgtatattcggtta ctatataata gcccgaaccc 107100 gtttcaatag tggggtgatc tgttttgattaaataatata catctctgac gctcatactc 107160 cctttatatc ccccagctac cagatattgattggaaactg aaaccgtcaa ccagtcgggt 107220 aaaaagggtg taacatcaac accatttatagccgaaacat aagtgttcaa gacactgatg 107280 ttatcttgat ccgggtcaaa taacgaaaatgaaatggtaa catagccgtt tgcatccggg 107340 gtatataaaa tctggtatgc cataacttaaatgtttttta ttaaatatta caagctgtaa 107400 atgaagactg cagccgcaac cgaagaagtaaacctgtcaa gtctgaacaa ttcgtaaata 107460 aatttaatat tatcacttcc ggcaactccgctgtatgaaa acaccccctg tctgcgtgtg 107520 atgatattgt cggtaagcat ataaacatacatgctgttgg taacataagg agggaaacca 107580 tctggattga tccaattgac aacatagctatgggtttctc ccggagcaat tgatgaagag 107640 
ggataagtgt acaccgtaga gttagtaacaaactcaaaat aagatgtact gttatacgta 107700 tcgggtgtgg aaccgctcaa tgtataaaatagagaaaaag tgttgttaat atgagacata 107760 attgaaaggg taaacgaaaa tgtatacggttgtgtcgaaa gagtccccat taaagacgga 107820 taatcaacag atcgactgct cagtatgttgaatgacttac cttcatataa cagcatattg 107880 tacttatact tatgataata atattttcgaatcatagaga ctgtaggagc cgggagagaa 107940 gtactaaaaa cagcgatttc gtaaaatttaaaagcagggt ctccagcatg atctctgata 108000 agcaatctgt taatatctcg ccttgtgtctgtattatact tttcaccaac aaaaactccg 108060 ttgatataaa atgatgttgt agaattgttatgagacactt caaaaagaag aggataagat 108120 aatacatctg agtaagaaaa ctccagaccatagccactac acgtatatgc ggtgctccca 108180 acagaagaac cgctatcaag actgaaaagcactacattta gtttattgtt gttgtttaca 108240 aaccctgttc ctataaagcg gttgttgtctgtactggtat ttaacatcat cacatccata 108300 aactgggaaa tgctatgtag cattgcaggagcaaaaacca taaaaacatg gaatggtgaa 108360 tctgggttat tgtctaacag attaacactaaatgtgttgt tataattata actatttata 108420 tggctggcaa gataagaata gcctttgtcccagtaataat gccaaccttg ccctgttgaa 108480 tataaagata catggcgatt acccgttaaagacggtaaaa cagaaatggt tggggtatat 108540 ccactttgtg ttatagttat atactcgacactccaaattt cacttaaact actcaaagaa 108600 acaatttcaa aattggaagt aaaaagagacagattataaa actttacgtt attatcaaaa 108660 tacacataaa cacccatatc atacacactttcaagggaaa aagacgggtg tgtatggtca 108720 aaagatattg taacatcaat gacacttctggtgccgatgt aatataaaga actggtaaca 108780 accgtaaacc agggcggtgt aacagaaaaagatatattat tttctttcaa tatggaagca 108840 cttactgaaa agggtattcc ataatcatctttcactgaaa aggtgataat gatggagtta 108900 ctgcttatta tgtaatctct gttcatctcattctttaaaa tttaaacctt aaataatcaa 108960 gatcaacttc gggggttgtg gttggataacttttattggc tgctttaagc tcccatttaa 109020 acccggcatt agaaaatggt ggaacttgagaatatgtatc aaacggataa tttgccgcct 109080 tgagttccca tttaaacccg gcattagaaaatggtggaac ttgagaatat gtatcaaacg 109140 gataatttgc cgccttgagt tcccatttaaacccggcatt agaaaatggt ggaacttgag 109200 aatatgtatc atatggtatt tcagatgatacaagttgggt gttaagagca gcggaagggg 109260 ttaaccctaa actctggaag gaatttttcatagaagaatt gctattagtc 
agataaacgt 109320 tactaacagg cacataattt accaccggagcaatctgtct gacaaatttt gataccgctc 109380 caataaacga cacgcttaca ctcgatttgggagatttgaa aagtccccat gcacttccgg 109440 ttaccgctga aaagaaaagg agattgaccggctcattcca gacgttttcc caatcgactt 109500 ccagcgcacc catttgagga gatgagtagaaagtcagatt gttacccgta cttttgagaa 109560 caacgtttgc tccggatgac gtcagatagaaatactgagt agatgtagag acatatactg 109620 agaagtaaga aagataactt accgtcagaaatcctgaatt aattgggttg aagaaatagg 109680 gtgatgtggt tgtataactg gcatatttcaaataaggtgt attatccgtg taatacagaa 109740 cacttgtcga ataattcaaa tccatgaaaatgcttgccgt tccgaaaaag tcggagaagg 109800 tgactaaata gggtgatttc cgaaccatgtatgttccatt atcagccgaa acgataaaaa 109860 cagattttgt tatatcgtaa agtaatgcgttgaatgtggt gctaccctga ttggaaaatg 109920 ttatataatc ttcaatatat ccgccgccaaggaaaagacg tttgattgag acggtttcgg 109980 tgctggatgt cgtgtaataa tccaaccagaacacatcata tccccatgca gctaccccac 110040 cgtattgaat gctggtggtg gtaacctgtttataacttcc actggaatcc aggagtccca 110100 cttctccgga agatccatcg tgcagattgagaaagagtgc gggagattga tcgtaaaaga 110160 atccaagcga tacgattata tccgaaggggtagtaatatc cgggaagtaa agagagatat 110220 tcgggtctgt agttgtaatg ttcgggaaatacagttctat attaacgtcg gatgtagtaa 110280 tcggaggcgg ttcacttggg gatgggggcggaggaggaac atacgaataa gaggggatca 110340 gaatcttctg cacgaaaaca taagcttcatccagcttgat cggttgagaa aacttcgata 110400 caaacaccac gtccccattc tgattcattccatagattcc gctgacataa ggagaaaccc 110460 ccggaacagc cgtgggattg gtggtaaagtttttggctac aaattcggca acgagaagct 110520 tcagaggctg caattcataa agcgtaagggtaaagaaaag cggggaaata gccggtactt 110580 tctttcccac aaccacaaac gattcatttcgaacaaggat aaaatccggg tgatcataat 110640 aactgtcatt cacaagcgag ataggatttcctccatacgt aatggaatgt atgtgatagt 110700 actgtcgctt gattctcaca atttcaaaagtgagcgcatc aaaatcgata ttcagcttac 110760 gaaaaagatc gataaaaagc gcacgaatctgagctttgag ttgatcatcc ggggcaattt 110820 ccagttttat attatcattt tcgtagctaccgggagaagg gaaaccaaca acttcggtga 110880 atacttttat atcggtttta taaaaatccttcgctctcat tttatacttg atgttcttct 110940 ctattaaata agaaaagttt 
attcaggggctacctttgat aatattccct gtacacccag 111000 ccggtaaacc caaggtccag aagtttgggaatatcccctt ctcttcccca ctgctttttc 111060 tgaaatacat cccccacgtt cagatcaaaactcataaccg gatcctgtaa atcttcaatg 111120 taacggctgg cgaaaccgtt atctctgatcctcaaaattt cttcatcggt tatctgatca 111180 tgaggaagat cataaagctc tatcagcttacgacaggatt gaatatagcg gttatcttca 111240 taaaaattag gcggatgtgg gttaatccagccggataaaa tatcggaata aaccttaccg 111300 gcaatcagtt cgtgagcttt gtttttcatatcgggttcga gaatatccag aagtatataa 111360 agctcttgat aggttttaac cggctcataatacctgagta tggaagcgat cgccattctg 111420 gtttttgtgt tctcataaaa agcatggagttccggtagaa tatactccgc cgaagaatgt 111480 cttattatat ccacttccgc catattctgagccataccct tttgattgta tgcaattttt 111540 ataacataag gagttccaac tatgcgataaaccttacgag aagaaccacc tcctatatac 111600 tccacagaag gcaattcatc catcaaagctttaagtgctc tgaaggaaag atgattgttc 111660 agcaactctt caaaatactc aaataaatcctgatctaacg atatgttgta accgtaaaca 111720 attttctctt tcattacttc ttcttattgtttttctccgt tcgcttaaga taagcgcgag 111780 ccgatttgaa aacccgttta ttactggtatttgctgcaaa attgatcatt ttcatagccc 111840 gatctctacc cacttttctg ataagggcacgcgccagcga ttcacccgac ttgtacacat 111900 catcaatatc tttgtctttg gggattccaagcacttcatg cattttcccg cgcttgactt 111960 taccgctttt gaatgcttat tgaatccacttttcttcctt tttagcctca gccagtttac 112020 gtttcttgag ttccttgata tacttcagagcgcggtcata aatattgtgt tcagggttga 112080 cgttggcggc aaaaaccaac atccccacggcttctttgta cgaaactttc ttcaggagat 112140 cacgcaccaa tttgcggtga tctttgtaatgatcgacaat atcttcatct tcggggattc 112200 caagcacttc cttcatgtgc ccggcttcacgcttgacctt gctcacccag tctttttccc 112260 gcgcttctgc aacggtttcc ccaccttcatccagaagtgc aatagcttcc gtatagaaga 112320 aagaatcccc ctccgtttct tccagaacgcgcaacgcttc tttcagaagc tccttgtgtt 112380 caatatgcat acgctttgcc attttcaacaaatctttgac aaaagaccgg gcatcataac 112440 cgtaatactt ggcgacagaa acgacggaaggaactccaag atacttacgc aacttcattg 112500 ctatgcgttc agccgttgcc tctggattacccatcttttg aatgagttct tcgcgggtga 112560 cattttcggt cagcagtttc tggcgaaattcgttaagctt ggttctggtc tcctgcaaaa 112620 
gccgcgcaac acggagaatt tcctttcggttcatatctta ttctcctttt cttttaatta 112680 aagaaaaata aagactctat gaaaacagaagacagaaaaa aacttgctca ggaaatcctc 112740 gacaaaatcg taaacaaagc catgcagcttgaaacgttga ttgacgatga atacaactat 112800 ctcaacagaa ccagtgtgct ggttgaagaggagagcaatc tgatgtcggc aaaggctcga 112860 atgcttgagc tacatattaa gattctcgacacgctgcaga aagtgtataa agatctgaaa 112920 gaagggattc aggaagaaga cgaaacggaaaagattctca tggagattat caatcagagc 112980 aaggctaacc tgtgaaaaca ggtagttcattcaatttttt agctatattg attccctgag 113040 ctatcacctg atccatgttg taatagttataagtagccag tctacccacc agtataatcc 113100 catggcactc cagttcgttc ttcatagaagccgccttttc tctgtaggtt ttcttgttaa 113160 tcggataggc tttgaacgaa ttttcctgaggatgttgaga gggatactct atggtataaa 113220 ccctgtcaag attcagcctg gagtgatcgataacgcgggt aaagggttct ctatcagagg 113280 aaagatgaaa tccgatggat tcgagcgtactccactcttt cagtttgcga agcacatagg 113340 gatcggaaga tataccttca agtttttctctggtttcaat tctgagatga atgtaaggga 113400 gatgttcttc tttacctgtt actctctggtaaagcctatc aaggtccccg gtataaataa 113460 aagggttatt ttttatgtcg tttagatgatcaagcgcgtc ttcggaataa actatattga 113520 ctaccggcac ataatttctg atataatcaatcatccgtat aatcattttc cagtaaccat 113580 caaccgggag cgccaccatt ttatcgtcaaaataagagtg atatcttttc cagtcggtaa 113640 agaagggaac gcgggaagct accgtttttaccatctcttc atcccagtaa tctccccaca 113700 ccttttttga gtaaggggca taccagttttcatagacgaa agatttaagc ggttccggga 113760 gattgcctac cggaattttt ctgttaagaagttcttcttc cagctcaatt tctccaagat 113820 acagccgcac ccagaaaaga gatgaaggaataaatgaaca tatatcattt tcggtaacag 113880 cataggcatt gtaactgata gagtaaaaggaagagaatcg cgacacaaat ttgatcactt 113940 cgggggaatt ggtatgaaag atatgaaccccgtatcggtg atattttttc ccccggtcaa 114000 aatcccagac gttcccaccg gggtggttgcgcttttcaaa aaaagtaatc tcttcaaatc 114060 tgaagccacg atcaaggaga gaaataacggtgctgagtgc cgcaagacct gttcccccta 114120 caaacaaccg tttcataatt ttctgacctccgggtctctt agcacttgag gaaacatacc 114180 ctgtacctgc tttccgacaa aataatccatgttgacgcgc cgaacggaga atatccgaga 114240 aggtctgtac atatgtgcgt gagataaatacggaagcgtg gcatatctcc 
acttgaacgc 114300 atgtgccaca tggattaaaa aagtttccgtgtgggttgga agataatcgt aatactttct 114360 cagagattcg agataatacc tgttaataacgtgccccccg tcaagatgca tgaaaacgcc 114420 gcgcttgtgc agtttgtttg aagtgaaataattgtagaca taatcggcat attcattaaa 114480 aagctttaca tcgaaataat cgagaaaatgttcttctctg ataacaatct ctctccagtc 114540 tttttttgcg aaagtctcca cataacatgcggcggcgtaa atgtaatatc cctcaggggt 114600 ttttaacctt tcatcggcaa gatgatcgggaagctcaatt cgctcatacg gatctttgaa 114660 aggtgtttcg ttttgatagg gaacaaccccgtaaggttct accatgttgg agcaaatcat 114720 aaaatattta tcgaaacacc cctcctcgaatgctttatcc aaccctttca gatatgccgg 114780 gggaatataa acatcgtcgt gcacaaacaccccggcttct acaccaagaa gaccaagtgc 114840 gtcaataaat gcagcaaact gagagtactgaatttcatct acaagaaaaa cattaagttt 114900 gtgatgttcc agttgatttt taagccacgatctccaaact tctttatctt ttaccggttg 114960 aaggttgaaa tgctggagca atttataatccgggtgaaag cgcatttccg gataaacaaa 115020 aatataaaaa taatcccgaa ggtgtgaaatgtgtgggtaa atatttttaa gccacacttc 115080 gggaatttcc ccgagcgtta tcattacaaaggcgatttcc ccgcgataaa aagaagggtt 115140 cggggaagaa gaaatcttct caaccttcattgttttccgc aattttaatt acaacccggt 115200 tttcttcaaa gttaagctcg aattccttttccgacgcatt gattccaagt tcttccagtt 115260 ccttttgaag ttcccccatg atcttcatgcgctcccgctt ggctttaagc gcatgctcga 115320 atacgttcac atcgccttca tcgacgatttccaccagcat atcaaagaca agcaccgcca 115380 tttcaatgta ggcgttgcgg agttgcatttcacgaagtgc aagggatttg agctggtcaa 115440 ccttctcttt gtcttctacg acaatggtattttcagagcg tgtaacggta ggttcaccca 115500 ctttaatgaa atctttccct ttcatggctttttttgatga aaaaacgatc gggttcaaga 115560 aaggtttcag aaggtgttaa gcggatattcttcagacgcc tgagcttccg aagccgcatc 115620 gtaatcgcca accgtaatat aaatatcctcaattccaaaa aattggggcg gatagcgaaa 115680 ttcataaagc gtaagttcag acggcgttacaacaatatca aatatccaca caccaccttc 115740 atcatttcca agaataatcg ctttgcagttgatcttatcc gattcggtaa agaaaggttt 115800 gcttctgaga acaaatcttc tccctaccgcaacaaaaagg ttggtgccgc tttccgggta 115860 aaaatcgacg tcgttgtctt cagaatcaacgatggtaaac acattgggtt gaatataaag 115920 cgtataagat ttgttgactc 
cataagcggtttgtggttcg atgataaagt tctgaggata 115980 ctgtgtttca tagaggatat ttacgtttacgatcatcggc atgaaatccc ttcgatcccc 116040 ttcatccgtt tcatacagcg taaccagatgaaatttgctg tcaaaaatat tccccacttc 116100 gcgaggttga cgaatgatag aaacagttggggagtataca ttataaacca cttcgtcatc 116160 tgaaagggcg aattttccca tcgtacgaagataatcggca atggagattt tctggcgaat 116220 gagctttgca atcaaacgct gaccgtattctgtaagagac gccgtaacaa gcatatcaga 116280 attccacctt cagtttaatc acgtattcatcttcgttggt cttacgaagc ggagaggaaa 116340 gtcttccaag cgccacaagc tgattgttctgatcatagat tccaacggtg gtaatataag 116400 taccctgatc aggatacaaa atgcgtccggaagtgggatc gtagaatgtg ggattgagcg 116460 agtaattgaa ttcgccagcc tttacgcgacagaaaataac catagagtga atgttatcca 116520 ccattgaaag cgtcatgtca agaataaggttggcgatgtt ttcatgaagt ttcgcaacgc 116580 caccgaccgg aggattgctg tcaatcgggaatccggttgg atccagcatg taaagcgaag 116640 tgctgtatcc ccccgatcca accccaagcgaatgagagac aaaatcgccg catgcatcca 116700 gatcgatcag aagcagcgcc gactgagggaatacgattcc aaacaccttc tgagttatgg 116760 agtgggttac aggaaccgcc tttccactctgaagtgaacc cgacaccaga tagtagaacg 116820 gctgcacctg ggtgcggatc ggagtagcgaggttggtttc gccgctgttg tcaaccagag 116880 acagggttgc gcggtttgat ccgtcggtgaaagacaggtt aatctggaaa ttaccaacat 116940 caattgtatc ggcaaaatta tggaaggagactaccatgaa gttcttaagt tcactgaatc 117000 cggcaccggg atttgcggga tcacggggaacagtaataac atcagaagcc agattggttt 117060 tgaattgatt gataaaagcc agataatgctttcgagtaat ttcattagtt ccaccgccat 117120 agttcccttt ggaagcaaac gccacggaaaactccggatc ggtggtaagc gatcctttga 117180 agatattcac atagtaatcg ttgtaatgatcgggttgcga agaataagtt acaaaatcgt 117240 ttcgtgcaat aactccgctt ttatcccagaaccccggaac ggctacattt ctcgtgctgt 117300 acaccacgtc gtcaatttgc ttgaagacaaacccctgagg ctgcggagcc gtctcagctt 117360 ccgtagcgat gttgttgaca acgtttcgtatagtttcgag ctcttccgta gtgatagaca 117420 gattcgattg cagatagttc agaaatccctgcaaaatctg tcgcttggaa taatcggtaa 117480 ccgttccaag tagcgtggta atgtaggaaataagttgctc tctcataatc ttatgtaatc 117540 agttttactt tcagttcagc ttgtgcaccactttcgttgc ccctgataag cacaatggtt 117600 
tcggtgttgg gcgtagccgt ttgcgccggttcgatcacaa acttgcgccc ggaaatcgct 117660 ttatagtcgg ggttgcttgt cgggaaggggaacggcgcct gtgtcgacgg ttcagcaaca 117720 gaaatgttta gctgtttgct atcataaagcaatgaataac caagaatctt atcaagtgca 117780 accacaaact ttgtacttgg gatgaagagatatcggcgaa ggttatctgt agaagtaacc 117840 ctgagcacga tttcactctg atccagaagcaaaatgggaa tttcgttgac aatgatattc 117900 ggatcgccta ccacactgaa aagatgatagcgcggcactt caaacggctt gggttctacc 117960 agcgaatagc tcagaatttg cggagggttacccgcgcttg cagaaaacac atattcataa 118020 tcgattccgt cgtcagaaag tgcaaagtaggcaatatcga aactaccgga cgcaaacaac 118080 cttctaccgt atgccgtgag cgtggctacggaatataccg tgttttctga tttggtgggg 118140 ataaacattt ttttctttaa ataaagttttcagttaccca ctacaagttt taaaagctca 118200 accagtatcg aagggtctac aatagccagcgcaatgaagg gaaaaagtgg caaaatcaac 118260 ccgacaataa atccggtgag tagcaatttgatgaacagtt gttgttgatc ttttctgttc 118320 tgcaatgcat tgtcataagc gccctgcaatagatcacgat atttttcacc cacctccgtt 118380 ttatctttaa gagactcgat ttgaacgctcagtcgattca atatatcttc tatatcatca 118440 agtttttctc taatgttgtt tctatcctgtacaacggttc tgagcacatc tttaatatct 118500 ttaatcattt caatgatgtt cgatagaatgaaacgtatct gataatctct gtcaaactgt 118560 tcttcgttca tccctgtttt taaattaaataaaaaagggg agacgacacg tagtacatct 118620 cccccataaa attatttttt atgcgaattaaggcgtgcgc ttttccagaa gatacatcgt 118680 aagaacatcg ccatttacag caagcagtttggtaccttcc cggttgggga aaggattgat 118740 ccggttgaaa tcaatgcggc gcatttcctgacgactatcc ttttcaaaat ggtttctcag 118800 gaaattcaga gcatttagcg tataatacatgtcattttct ttccagtagt gcgcaatgat 118860 cttgcgtttg tggttcagca cgaaatccttttcgcggtgg aattccggaa cggaataaat 118920 gacattatcc accagcgagc agtaatcggcaagatcctga agatgagcat attcagttga 118980 aagaatatca ggaacctcgc tgggaatgtacgtgtcataa tccccgtaaa caattccccg 119040 gatattagcc gcattgtaag tcgaagactcgacaaaatgc gttttcaacc gcttgtctga 119100 gtaatccttg gaaaacagaa atatccggcttggattttca ggatcgtcga acattaaaag 119160 cttttcagcc ttaacaaacc actcatattccgacttgaat accggcttga aatagaacac 119220 ggtgcggaat tgagtggaac gtcgagagcgtcttggtttt gccaacgtac 
ccatagtctt 119280 tctcctattt tttttttgtt tggttttaaatacattctgt gtaacaataa aaattcataa 119340 aggtttcaaa gatttttaca atttcttatttaaagaaaaa tgcccgcaaa atcaagaaaa 119400 caacagagat atatattcta tctcagaaacaaatatggat caccggaaaa aacccccaag 119460 aaatacaaat ggatatggca caaagattgggagaaactgg aggaagccaa acgtaaaaag 119520 aaaaagaaga aaagacgtaa aaataaacgctcttacctga agccggattc ctattataag 119580 aaaccatacg gttattacgg aatctggtattaccattatg atgacggtgt ggatgatggg 119640 ggagatgcgg gtgatggtgg aagtggtgctggtgtgggtg aagctaaagg tgcaaaacct 119700 gctaagaaat ctaaaaaaga agtgctccgcgatcttgagg tcaaactgca cgacttcaac 119760 aaggagttga aaaaactcct cgagaatctcggattctaaa aaaagaaagc cggggaatca 119820 accccggcat tttttgtttc accaaggtaaatcgttatta tccgtctgac acgactcaat 119880 caattcctct tcttccggtt cttcttcttcataactgtaa aaccatatat gaaagtaata 119940 tccccgccct tcttcccact ctcgcatttgctgctcttcg aagacgtctt gcaaaaaacg 120000 ctttgtcaaa ctcatggctc cctcctatttttggttgaca aatcgattct actggtaata 120060 tacgcattga tcagagaaaa gtcaaatgaagaatctgtta aacagagata tatccgcact 120120 cagcggaagt gtagatgtca acctcagatacaatcttata aagttttttc attagcacca 120180 caaagtcttc ttccggaagc tctaatagcaattcctcatt atatttgaga aggggaagca 120240 tagctttact tccgctaccc accgaataatagggatcacg aatgaacatt gtcgtgaaat 120300 tgtcggatac cacaaacacc ccgtgctgactgattcccat aattttcccg ttcatatccc 120360 cattatcatt cagaagattc aaccccttcagatgatctct ccacttgtac gtaaacgtct 120420 ctacaatcgt gtttttgctg taggattctctgttaaagat aagcggcgag gaaaattttg 120480 caaaggcgtt ctgatagatt acccttccgacaaaaccaag cggaattcga tccaccagtt 120540 cggacggcgt ctgaatatcc agaaaagcggctttggggtc atccctgacc accagcatgc 120600 cgtccattgt ggtggtatag tcaaaaaaaacataccgctt atcggctctg tcaatggcta 120660 ctaccgtgct catgattatt taagggtttgcgttaaggtg tctttcaaat gatacacaat 120720 gtctctaatc tcagcaggta taagttctctgatggtaaaa aattcgtcag gaaatattat 120780 ctgtctcaga ttaatatatc ctattcggttgtgatagcga actccgttcc ccaccgctac 120840 cacaatatgc tcgaaaggtg agcggtgtccgtttttgtaa agtcttcttg caagttttaa 120900 atttttatct aagtcggatt 
catcagaagcataggaaacg cgggctatac gcgccaccga 120960 tgtaaccagt agtttggaat tcaattcctccggtgaaata actccctgaa gaggatcgac 121020 aatatctccg ggattggctt cgaaagccggggagttgtcg tagatatagc gaatcaggag 121080 cgctattttt cggaattcgg gttgtgcgtcggaagcgcaa cgaagtctga aaaagttgtc 121140 aagcgaatac ggatcggcaa tggaagcgatgacatctgta tatgcatagg gtgaaagtat 121200 tcgattagcg tgttgcttgt gtacattgagtttctcgagc acaaaatgca accccgctga 121260 tgtatataac ccggtatacc aacaccaccttgccagcaca tctttccacc ctcctatttt 121320 tttatcggaa aacattgccc ctgagttttccacaaaatca tctggaacaa aggggttttc 121380 aagtacacgc ttccggtatt ttttcaaggaaatggctctt gtagaagcgg cattcctcga 121440 aaaggcgcga tgtgtattaa attcagccagtatgactgtg ggaatttgaa agcgaaagca 121500 gaagaagata tcgttattcg tcttcgttttgatcagatac cacaccattg ctttctgagt 121560 tttttccatt taaaaccgat ccatttaaacaaagtttcat ttctctgatt tcatctttca 121620 gggtttcaga aaggtgtaat atttcatcccactcgcgagt attgcgctca aaaatctcaa 121680 ctccactgga aacatcactc atcatttttttcctccttgt gtttaatgtt gtggtaaatc 121740 tataatttgc gggtgttgcc agaagcgcatttcatagata ccccactttc ttttaaataa 121800 aagaaaaata acttttttta aaaaattatttaccccggtc gtcaagtctt ccaatatccg 121860 gctctttata gggatgaatg cgatatttcttctggaaaaa ctccttccac tccggcgatt 121920 tttcaaatac gacatccaga ttgtcgataagtgcctttgc caccatgtgt ttgcagattc 121980 tcccccggta gtaatgatcc gggcacgtacatttgaatgt gcgggtgtca aaattgacgc 122040 gcgttacgta ttttttgtag gagccgctttcgagagtatc acgggcgcgt tcaagcagcg 122100 ttctcccctg tttgtctttc tgagtctgacaccatttgag aaaagagagg agttgtttat 122160 atcgcttatc gaacgtttca agccgcttgcgctcgcgttc ttcaatagag cgcttgagtt 122220 cttcagcctt ttgctgtctc tcttttatacttggcggcgg catacgctca gatgtttaaa 122280 attctccgaa gcagactttc gatctgatcttcggtaaaat ctcttttgga tatataaagc 122340 acaagattgc gctccgtgtt gtttttcggtggggtgtttc ctttgagaag gtttctgaaa 122400 tattcgatga actctttttt aagctctttttccgactttc cggttttatt ctgcacaagt 122460 gttttttcgg tggaagtagc ctcttcttcgttgacggcga cattatatat agagagcatg 122520 agaaacatct ctaccgccgg atcggtagcatgacgtgcag ccatatcggc acttcggcgc 122580 
tgaatggctt catatagcag tgattcctcgcgggtcattc ttttctgtta aataaatgtg 122640 aatccgtagg aaagggaaaa catcaggggaaagcacacat ataagaacca gaacccagaa 122700 caaaaaatag gaggaaggtt atgggtaagatcgatgtttc gaacatcaaa accgccgttg 122760 ccatctccca gaaagccaat gttcctctttatctgtgggg tggtgtggga atctccaaaa 122820 cccaacaaat ctatcagtat gccaccagcaccaatcaaaa atgtgctgtc gttacggggt 122880 tggcaataga tccaaccgac gtagtgggtcattacattgc cgacttcaat aaacgtatca 122940 cctaccagac caaaccctat ctttatgaactcttcggtga ggaagagcgg ggaatcatct 123000 tccttgacga attcaacaac tcagaaagtgatgtgatggg ggtgtttcta aagcttctcg 123060 acgaaaagag gcttggaagc tacaaactccctgatggaat tcacatcatt gcagccggta 123120 atccccccga actggctcca aatgcttcctcgcttccgct tgccgtcgct actcgatttg 123180 cccatcttta tgtggaagcg gatttcatctcccttaagag atggttgaaa ggagcggaag 123240 atgaagagga ttatgtaaag attttcaatcttgaagtcgg ggaagatgtt gttcagcagg 123300 tgttcgatat tttcgttgac tactgcattgaaaacggtct tttcccggct tcagaagatt 123360 ctcgtagttg cgagtgggag gggagcctgaattaccgcac attgcactat gcagcaaaaa 123420 tcggggctgt atacaaagtt gcttacaaaaatgtatcaaa tcaatcgaca ctgtataatg 123480 taactgtaga aatgatccac ggtctggttggaaccatcgc ttccaacctg atggaacatc 123540 ttgaaaacaa gtggcttcca tcggcaaaagagattctcga aaactatgat attgtgctca 123600 agcatcgaga cgcctatgcc gcccttgcctacaaccttat gagcggcatt caggaagaag 123660 actatccgag gttggtggat ttcatgcaatggttagaaaa gaaaaacgaa cttgtaatgc 123720 ttgcggcgat agtggaatct ttccagtcgttcattccgaa gaaaaggttt ctgacaagcc 123780 ggttcgaata ctacaaccag attttcaaaatcattaatcg atcgctggac gtctataaga 123840 aagtcaaacc cacaaacaac aagtgattgattatggaacc gattgtcgaa aaaaagctct 123900 atgaactgat taactgcatt gtaaaaaatcataccccact cgccatgatt ctttcccgaa 123960 tcaaagtgcg ggtagggggt agggataaatacacactggg actctgcaaa gaacgggaaa 124020 tcattctcag ccggtgtctc tttgatgatgaaatcgttta tcccaaactt gtatttatca 124080 aagaccccga caccggcgag atcgtagactatgatattga agactatgtt gccaaaatcg 124140 atgatgaagg gcggtatcat accctgctggaagaaatcat tcatgccggt ctcatgcacc 124200 ccatgcgtgt agaccggttt cagaaaacatatcaggagct ttttgaaaag 
aacaagcggc 124260 tggtgaattt tctgtacctt tgtcttgaggttgagcgtca tgcaatacat accgctgtag 124320 ccaacatcga tctacttaag cccgtgttcaaagacaacac gcgggatgaa aagattgtgg 124380 aattcattaa agttattcaa catgatcatcccgatcaaaa gctgtttggg tttacttttg 124440 aaagactgtt tttgaagtat ctcaacgattttgagggagg taaaattgca gcccccgcaa 124500 tttacgatct gatggaatac gacgggaataccgttcccga caagttcata gaagcaatcg 124560 aaaaatctct tcataaaggg aaaaagtatggaaatcagac actggatgaa atctttgaaa 124620 tccggcgtgt ggatcaaaag gggttgcaattgacgcaact cctgaagcag atttgcttcc 124680 gcagggcacg taaaaaaccc tcgctgcacgtgctcgacaa aaagcggaag cactacgaac 124740 cgctcaggtt tgggaaaatc aaagaaaaaacttcaaatat cgccattatt ctggatgtgt 124800 cgggaagtat gcttcgtgat ttcaaaaagcatcgcctgat tgacatcgcg acaagtatga 124860 tcgtggaaac tttcaaaaac gcacccaatatcgatgtata catcggagat accgaaatca 124920 aggataaagc gaagatccgc accctgttttcccgtttcaa agggggcggt ggaaccgata 124980 tgtctaacat ctataaacaa ctgaaagatcgataccagaa aatactggtt gttaccgacg 125040 gggagacacc cttccccgaa ccaaaagactaccgccctca ggatactttt atcatcatta 125100 atgatgaaat gcccgaaatt cccaattacatcaaaaccct gaaggtgaaa ctatgaacga 125160 aaaagcgttc cagttccgca atcttctaaaggaagtgatc ggcatgcgaa tcctcgagcg 125220 attcaaccac atagaacctg aaggaaaaaggaaatgggta attttatccg cctacattct 125280 aatagtggaa gaagaaaatg caccccagatctgcaaggaa cttgttcgaa acaatacaga 125340 gatagatcct ctggaatttg tcagatctttcaaagaagaa cttataaaca tgatcgaaaa 125400 tcaaaattat cgaaatgaat ttgagaaatacgttgcaaac tacgcgatag aaaacgaaat 125460 caattacaga aacatgatag caaactttttctgatataaa aaggaaaacc cccggttcat 125520 caccgggggc ttcctcagcg tctattccctatcgggtaag ttccgccatt acggctgcag 125580 gagcttcaca tacagcagac catagaactctggacgcacc acctccagag cgtagcgggt 125640 catcagacca cgccggtaag agaagttaacgggatcgacg atcgtcggcg tgaacagcag 125700 cggcacatac ggagcgtaaa ccgcacccgtttgccacggc gtgttcagat cttgattacc 125760 catgatgatc accggctggt tctgatagatgttcttgtac aggcggtagc gtccctgcac 125820 catacctaca tagaagatac cggtaccaccatcgcggttg tcgttacccg gcgtaaagcc 125880 cggcatcgac tccagcagcg 
cagccacctgtgggctggta acaaggaagt tggcacccgc 125940 aaccgccgtc ttctgctgaa tgcggttgctgaccttgttc agttcgatca tcagggtagc 126000 caaccattcc tgcttcgagc cgtagaagttgccggcaaca aagttacccg acgtttcatc 126060 gtagtattca ccgaccactt ccgaccagaagccatagttg tcagtgcgcc gggcatgcgc 126120 catgatcgtc gacaggattt ccagatcgatctcacgggca atatactgag acatgagcgt 126180 aacgatttcg ttttcaagat cgacgcccttatgataggcg gcgagatcct gcatcgcttc 126240 cggcgtccag gcggcacgca gcttacgggtcttggtagcc accggacggc tccgaagctc 126300 aaggttgatc tccggaatat ccagcgactggaatccggga tccggatagt caggatccgt 126360 cgactggtct tcgaagtcgt ttcgagcatcgatgtagtag accagatcaa ggtcttgagt 126420 cgacggagta ccaccggcaa cggtcgcaaagtcagagccg gttacgaaga acaaccgcgc 126480 gtacagcgcc gaaccgacgg caccgacgatccggttgtag cgcggaagcg ggtaagcgac 126540 cgtgttttca ggatcaccgg aagcgtcgtcatattgccag aagcgcaccg tattcacatc 126600 ggcgacaccg ggaagcgacg caaccggcacatccacatag tacaccgcac cgctcgagac 126660 cagtgaagca atgccggtgt caaaaccgacatcccgcatc gtcgcctgct gggcggtagc 126720 cagatccacg gtgatcgtgg tctcatactcccgacgcgac aggcgagcgt tttcatcata 126780 cagaccgccc gtagccgtgt cggtggtcagaccggtacca ccgtagaccg agccgtttcc 126840 gggaagctca ggcgacttga aatccagatagaagaccaga cctgttggga gcgacagcgg 126900 ctgcaccgac accagatccg tcgcacgcaggttggcgaac acacggcgca caatcggaag 126960 tgccagattc cagccgtcaa cctccgtggtctgggtggtc tccatcaggt gcttcttagc 127020 ctcgcggtac tggttctcca gcagggtagcgagcgtatgc cgctcccagt cgttacggca 127080 accctccaga agcggttgcc acttctcgataagttgttcg ttaattttcg gttggctcat 127140 tttactctta tttttttttt gttttaaatctctaaaaaca ttcggtattt aaccgaatta 127200 gagtccggca agacgcttaa tgcgctccagatccaacagc ggatcctcaa ccgaacccga 127260 cgtgcgggat tcggcgacag gtttcaccacacgggtttcc tgcagcttgc gcttgatccg 127320 ctcacgaatg cgctggcggg taacttcgtcaatgacgggc ttcttcaccc ggctttcccg 127380 catacggcgc ggcgtcggac acgaagcagcttccgccatt tcctccgtct cctcaccagc 127440 caccttgcga agcagttcga tagcctcctctacgcgctca agaagcgtgt gcaaaagatc 127500 catttcttca cgcacatgct cctcgctttcgacggcaccc ttgatgtcaa catcaatatc 127560 
gagttcttcg tcgtccgaaa gatcggcatcctttaactcg atttcggctt ccagctcgcc 127620 gtcttcgtcg acgtcttcaa catcgattttgatgtcttcg tcatccagat ccagttccag 127680 atcctcctca tcgagatcca gttccagatcttcgtcttcg tgttcggctt cggtaatccg 127740 acgcttcatt ttgtgtttag cttctcgcatttcttcctgt tgtttagttt cccaggcttt 127800 ttgcaacgtt tgcgcgactt cttcggcaaacgaagcaagc gtgtcatcgt cagtttcatc 127860 aaccttcgct tccgtcaggg gattaccttcaggctcttcc acctgctcaa gctcttcttc 127920 gagttccttc aaagcctccc gaatttcttcggtgatcgag tgaatggaag cttcctgaag 127980 atgcctcttc tgcttttcct gacgaagcgactcaaggtaa cccacaaagt ctttaaattc 128040 ctgcataact caaatttaaa tatatttaatcgccattcat tttaaaaaat ggcacctgct 128100 acgtgggaat attaaacccc acatcaaatttaaatatata tttttccgca cgaaataaaa 128160 aaaagggaga gaaactgatc tctcccccaggggtaacatg tattttataa cttgtgaact 128220 accaccacct atagatgtgg atggcttcgtggtcaaggta gctcttgcta ccagattccc 128280 cacgctcaaa gggctgttcc atccccgaatttgccaatta ttggctaatt aaatcacttt 128340 cttcttcaca agcgcctcgc gcaatctatggcgtgctctg ttgattcgag atttgacggt 128400 tccgatagga atgttgtttt tctcagccagcgcctgcatg gcacattttt ccttcaactc 128460 catgtactgt ttcattatat tgtaaaacgggttatcgcct ttttcaagtt cttcctgaat 128520 cacctccacc gcacgtttca attcatactgctcatcaagc agcggctctc cggattcaat 128580 ttctacaggg gtatcccgat ctccaaatgtaatttcttcc atgtaaatcc tgggaccacg 128640 tcgagaattg tatcgataac gggtaattacaacggatttg aatacagtgt aaatatacgt 128700 agcgaaagaa gctccttcaa ccacgctataggaatcaaat ctaagaagcc ttagaaacgt 128760 atcctgcacc atatcctcga tttcatgctccgactttgta tatttctttc cgaaattttt 128820 gagccgctca gcatacctac tgtaaagcacttcataacgg atttcgagcg gaacgttctg 128880 aaggaaaagt tcttcgtcgg tcatctgatagtatttacgc ttatccataa cttctcctct 128940 gttttaaagg ttgaaaatga tttcgtagccgtcaatgcgg tcatcagtta tgagcttgtc 129000 ataaagatcg tcaatctttt ctatgttgtcataaatcatg agcatttcat caatatcatg 129060 cacaacaaca gaaatgtaat aaccgttttcttcatcaaaa cagagttcgg cgtagctttt 129120 atgaagatcc cccacacatt ttttgatattctctacaaat acttctacag gataatatct 129180 aaccagacaa aaatcgagat gcggaagcacactgttcatt atctgctcca 
gcacctcttc 129240 attagaaatc agcagttcat aatcgtcaatggtatttaca ttgtagacaa gatggttgtt 129300 ggatgtaata gtaggaatga cttcgtgagttgggttgaaa tatgtgttta ccagatcttg 129360 aacggaatcg agcgcgtcgc gataatcaattttagtctcc atagccatat cctaaaaggt 129420 tggttgatat ataccggtta tggaaaataaaaaaggggag agagggtgtt tcctttccca 129480 caccctctct ccgcgtttat agggaaagtcgtcatcgaag tagccctcaa tctcctccat 129540 cgaatgagat tctttttcgg gagagattcacctttttctt ggggggatta caaaattcct 129600 cttccgcggg atcaaagccc ctaatggagaatctcttcct ggcgggagcg ggttcatgct 129660 tccccaatca ttgttatttc cccagacaaacacatggacg tctcttccac caacaaaacg 129720 aacttctccc gaatcggcgg agatgttcgggagttcaccg cgaacccgga tcagggatcc 129780 caccccggct ttccgcacaa gatcggaaggtcccgttgta aaagtaaatg ttgaaccact 129840 cggtgctttc ttgctgaccg tcccaccgacggaccgccag ccgtacggtg gtataactgt 129900 ttccgcta 129908
2 892 PRT Vaccinia virus (strain Copenhagen) 2
Gln Asn Ala Thr Met Asp Glu Phe Leu Asn Ile Ser Trp Phe Tyr Ile 1 5 10 15 Ser Asn Gly Ile Ser Pro Asp Gly Cys Tyr Ser Leu Asp Glu Gln Tyr 20 25 30 Leu Thr Lys Ile Asn Asn Gly Cys Tyr His Cys Asp Asp Pro Arg Asn 35 40 45 Cys Phe Ala Lys Lys Ile Pro Arg Phe Asp Ile Pro Arg Ser Tyr Leu 50 55 60 Phe Leu Asp Ile Glu Cys His Phe Asp Lys Lys Phe Pro Ser Val Phe 65 70 75 80 Ile Asn Pro Ile Ser His Thr Ser Tyr Cys Tyr Ile Asp Leu Ser Gly 85 90 95 Lys Arg Leu Leu Phe Thr Leu Ile Asn Glu Glu Met Leu Thr Glu Gln 100 105 110 Glu Ile Gln Glu Ala Val Asp Arg Gly Cys Leu Arg Ile Gln Ser Leu 115 120 125 Met Glu Met Asp Tyr Glu Arg Glu Leu Val Leu Cys Ser Glu Ile Val 130 135 140 Leu Leu Arg Ile Ala Lys Gln Leu Leu Glu Leu Thr Phe Asp Tyr Val 145 150 155 160 Val Thr Phe Asn Gly His Asn Phe Asp Leu Arg Tyr Ile Thr Asn Arg 165 170 175 Leu Glu Leu Leu Thr Gly Glu Lys Ile Ile Phe Arg Ser Pro Asp Lys 180 185 190 Lys Glu Ala Val His Leu Cys Ile Tyr Glu Arg Asn Gln Ser Ser His 195 200 205 Lys Gly Val Gly Gly Met Ala Asn Thr Thr Phe His Val Asn Asn Asn 210 215 220 Asn Gly Thr Ile Phe Phe Asp Leu Tyr Ser Phe Ile Gln Lys Ser Glu 225 230 235 240 Lys Leu 
Asp Ser Tyr Lys Leu Asp Ser Ile Ser Lys Asn Ala Phe Ser 245 250 255 Cys Met Gly Lys Val Leu Asn Arg Gly Val Arg Glu Met Thr Phe Ile 260 265 270 Gly Asp Asp Thr Thr Asp Ala Lys Gly Lys Ala Ala Ala Phe Ala Lys 275 280 285 Val Leu Thr Thr Gly Asn Tyr Val Thr Val Asp Glu Asp Ile Ile Cys 290 295 300 Lys Val Ile Arg Lys Asp Ile Trp Glu Asn Gly Phe Lys Val Val Leu 305 310 315 320 Leu Cys Pro Thr Leu Pro Asn Asp Thr Tyr Lys Leu Ser Phe Gly Lys 325 330 335 Asp Asp Val Asp Leu Ala Gln Met Tyr Lys Asp Tyr Asn Leu Asn Ile 340 345 350 Ala Leu Asp Met Ala Arg Tyr Cys Ile His Asp Ala Cys Leu Cys Gln 355 360 365 Tyr Leu Trp Glu Tyr Tyr Gly Val Glu Thr Lys Thr Asp Ala Gly Ala 370 375 380 Ser Thr Tyr Val Leu Pro Gln Ser Met Val Phe Glu Tyr Arg Ala Ser 385 390 395 400 Thr Val Ile Lys Gly Pro Leu Leu Lys Leu Leu Leu Glu Thr Lys Thr 405 410 415 Ile Leu Val Arg Ser Glu Thr Lys Gln Lys Phe Pro Tyr Glu Gly Gly 420 425 430 Lys Val Phe Ala Pro Lys Gln Lys Met Phe Ser Asn Asn Val Leu Ile 435 440 445 Phe Asp Tyr Asn Ser Leu Tyr Pro Asn Val Cys Ile Phe Gly Asn Leu 450 455 460 Ser Pro Glu Thr Leu Val Gly Val Val Val Ser Thr Asn Arg Leu Glu 465 470 475 480 Glu Glu Ile Asn Asn Gln Leu Leu Leu Gln Lys Tyr Pro Pro Pro Arg 485 490 495 Tyr Ile Thr Val His Cys Glu Pro Arg Leu Pro Asn Leu Ile Ser Glu 500 505 510 Ile Ala Ile Phe Asp Arg Ser Ile Glu Gly Thr Ile Pro Arg Leu Leu 515 520 525 Arg Thr Phe Leu Ala Glu Arg Ala Arg Tyr Lys Lys Met Leu Lys Gln 530 535 540 Ala Thr Ser Ser Thr Glu Lys Ala Ile Tyr Asp Ser Met Gln Tyr Thr 545 550 555 560 Tyr Lys Ile Val Ala Asn Ser Val Tyr Gly Leu Met Gly Phe Arg Asn 565 570 575 Ser Ala Leu Tyr Ser Tyr Ala Ser Ala Lys Ser Cys Thr Ser Ile Gly 580 585 590 Arg Arg Met Ile Leu Tyr Leu Glu Ser Val Leu Asn Gly Ala Glu Leu 595 600 605 Ser Asn Gly Met Leu Arg Phe Ala Asn Pro Leu Ser Asn Pro Phe Tyr 610 615 620 Met Asp Asp Arg Asp Ile Asn Pro Ile Val Lys Thr Ser Leu Pro Ile 625 630 635 640 Asp Tyr Arg Phe Arg Phe Arg Ser Val Tyr Gly Asp Thr Asp Ser Val 645 650 655 Phe Thr Glu Ile Asp Ser Gln Asp Val Asp 
Lys Ser Ile Glu Ile Ala 660 665 670 Lys Glu Leu Glu Arg Leu Ile Asn Asn Arg Val Leu Phe Asn Asn Phe 675 680 685 Lys Ile Glu Phe Glu Ala Val Tyr Lys Asn Leu Ile Met Gln Ser Lys 690 695 700 Lys Lys Tyr Thr Thr Met Lys Tyr Ser Ala Ser Ser Asn Ser Lys Ser 705 710 715 720 Val Pro Glu Arg Ile Asn Lys Gly Thr Ser Glu Thr Arg Arg Asp Val 725 730 735 Ser Lys Phe His Lys Asn Met Ile Lys Thr Tyr Lys Thr Arg Leu Ser 740 745 750 Glu Met Leu Ser Glu Gly Arg Met Asn Ser Asn Gln Val Cys Ile Asp 755 760 765 Ile Leu Arg Ser Leu Glu Thr Asp Leu Arg Ser Glu Phe Asp Ser Arg 770 775 780 Ser Ser Pro Leu Glu Leu Phe Met Leu Ser Arg Met His His Ser Asn 785 790 795 800 Tyr Lys Ser Ala Asp Asn Pro Asn Met Tyr Leu Val Thr Glu Tyr Asn 805 810 815 Lys Asn Asn Pro Glu Thr Ile Glu Leu Gly Glu Arg Tyr Tyr Phe Ala 820 825 830 Tyr Ile Cys Pro Ala Asn Val Pro Trp Thr Lys Lys Leu Val Asn Ile 835 840 845 Lys Thr Tyr Glu Thr Ile Ile Asp Arg Ser Phe Lys Leu Gly Ser Asp 850 855 860 Gln Arg Ile Phe Tyr Glu Val Tyr Phe Lys Arg Leu Thr Ser Glu Ile 865 870 875 880 Val Asn Leu Leu Asp Asn Lys Val Leu Cys Ile Ser 885 890
3 892 PRT Vaccinia virus (strain WR) 3
Gln Asn Ala Thr Met Asp Glu Phe Leu Asn Ile Ser Trp Phe Tyr Ile 1 5 10 15 Ser Asn Gly Ile Ser Pro Asp Gly Cys Tyr Ser Leu Asp Glu Gln Tyr 20 25 30 Leu Thr Lys Ile Asn Asn Gly Cys Tyr His Cys Asp Asp Pro Arg Asn 35 40 45 Cys Phe Ala Lys Lys Ile Pro Arg Phe Asp Ile Pro Arg Ser Tyr Leu 50 55 60 Phe Leu Asp Ile Glu Cys His Phe Asp Lys Lys Phe Pro Ser Val Phe 65 70 75 80 Ile Asn Pro Ile Ser His Thr Ser Tyr Cys Tyr Ile Asp Leu Ser Gly 85 90 95 Lys Arg Leu Leu Phe Thr Leu Ile Asn Glu Glu Met Leu Thr Glu Gln 100 105 110 Glu Ile Gln Glu Ala Val Asp Arg Gly Cys Leu Arg Ile Gln Ser Leu 115 120 125 Met Glu Met Asp Tyr Glu Arg Glu Leu Val Leu Cys Ser Glu Ile Val 130 135 140 Leu Leu Arg Ile Ala Lys Gln Leu Leu Glu Leu Thr Phe Asp Tyr Val 145 150 155 160 Val Thr Phe Asn Gly His Asn Phe Asp Leu Arg Tyr Ile Thr Asn Arg 165 170 175 Leu Glu Leu Leu Thr Gly Glu Lys Ile Ile Phe Arg Ser Pro Asp Lys 180 
185 190 Lys Glu Ala Val Tyr Leu Cys Ile Tyr Glu Arg Asn Gln Ser Ser His 195 200 205 Lys Gly Val Gly Gly Met Ala Asn Thr Thr Phe His Val Asn Asn Asn 210 215 220 Asn Gly Thr Ile Phe Phe Asp Leu Tyr Ser Phe Ile Gln Lys Ser Glu 225 230 235 240 Lys Leu Asp Ser Tyr Lys Leu Asp Ser Ile Ser Lys Asn Ala Phe Ser 245 250 255 Cys Met Gly Lys Val Leu Asn Arg Gly Val Arg Glu Met Thr Phe Ile 260 265 270 Gly Asp Asp Thr Thr Asp Ala Lys Gly Lys Ala Ala Ala Phe Ala Lys 275 280 285 Val Leu Thr Thr Gly Asn Tyr Val Thr Val Asp Glu Asp Ile Ile Cys 290 295 300 Lys Val Ile Arg Lys Asp Ile Trp Glu Asn Gly Phe Lys Val Val Leu 305 310 315 320 Leu Cys Pro Thr Leu Pro Asn Asp Thr Tyr Lys Leu Ser Phe Gly Lys 325 330 335 Asp Asp Val Asp Leu Ala Gln Met Tyr Lys Asp Tyr Asn Leu Asn Ile 340 345 350 Ala Leu Asp Met Ala Arg Tyr Cys Ile His Asp Ala Cys Leu Cys Gln 355 360 365 Tyr Leu Trp Glu Tyr Tyr Gly Val Glu Thr Lys Thr Asp Ala Gly Ala 370 375 380 Ser Thr Tyr Val Leu Pro Gln Ser Met Val Phe Glu Tyr Arg Ala Ser 385 390 395 400 Thr Val Ile Lys Gly Pro Leu Leu Lys Leu Leu Leu Glu Thr Lys Thr 405 410 415 Ile Leu Val Arg Ser Glu Thr Lys Gln Lys Phe Pro Tyr Glu Gly Gly 420 425 430 Lys Val Phe Ala Pro Lys Gln Lys Met Phe Ser Asn Asn Val Leu Ile 435 440 445 Phe Asp Tyr Asn Ser Leu Tyr Pro Asn Val Cys Ile Phe Gly Asn Leu 450 455 460 Ser Pro Glu Thr Leu Val Gly Val Val Val Ser Thr Asn Arg Leu Glu 465 470 475 480 Glu Glu Ile Asn Asn Gln Leu Leu Leu Gln Lys Tyr Pro Pro Pro Arg 485 490 495 Tyr Ile Thr Val His Cys Glu Pro Arg Leu Pro Asn Leu Ile Ser Glu 500 505 510 Ile Ala Ile Phe Asp Arg Ser Ile Glu Gly Thr Ile Pro Arg Leu Leu 515 520 525 Arg Thr Phe Leu Ala Glu Arg Ala Arg Tyr Lys Lys Met Leu Lys Gln 530 535 540 Ala Thr Ser Ser Thr Glu Lys Ala Ile Tyr Asp Ser Met Gln Tyr Thr 545 550 555 560 Tyr Lys Ile Val Ala Asn Ser Val Tyr Gly Leu Met Gly Phe Arg Asn 565 570 575 Ser Ala Leu Tyr Ser Tyr Ala Ser Ala Lys Ser Cys Thr Ser Ile Gly 580 585 590 Arg Arg Met Ile Leu Tyr Leu Glu Ser Val Leu Asn Gly Ala Glu Leu 595 600 605 Ser Asn Gly Met Leu Arg 
Phe Ala Asn Pro Leu Ser Asn Pro Phe Tyr 610 615 620 Met Asp Asp Arg Asp Ile Asn Pro Ile Val Lys Thr Ser Leu Pro Ile 625 630 635 640 Asp Tyr Arg Phe Arg Phe Arg Ser Val Tyr Gly Asp Thr Asp Ser Val 645 650 655 Phe Thr Glu Ile Asp Ser Gln Asp Val Asp Lys Ser Ile Glu Ile Ala 660 665 670 Lys Glu Leu Glu Arg Leu Ile Asn Asn Arg Val Leu Phe Asn Asn Phe 675 680 685 Lys Ile Glu Phe Glu Ala Val Tyr Lys Asn Leu Ile Met Gln Ser Lys 690 695 700 Lys Lys Tyr Thr Thr Met Lys Tyr Ser Ala Ser Ser Asn Ser Lys Ser 705 710 715 720 Val Pro Glu Arg Ile Asn Lys Gly Thr Ser Glu Thr Arg Arg Asp Val 725 730 735 Ser Lys Phe His Lys Asn Met Ile Lys Thr Tyr Lys Thr Arg Leu Ser 740 745 750 Glu Met Leu Ser Glu Gly Arg Met Asn Ser Asn Gln Val Cys Ile Asp 755 760 765 Ile Leu Arg Ser Leu Glu Thr Asp Leu Arg Ser Glu Phe Asp Ser Arg 770 775 780 Ser Ser Pro Leu Glu Leu Phe Met Leu Ser Arg Met His His Ser Asn 785 790 795 800 Tyr Lys Ser Ala Asp Asn Pro Asn Met Tyr Leu Val Thr Glu Tyr Asn 805 810 815 Lys Asn Asn Pro Glu Thr Ile Glu Leu Gly Glu Arg Tyr Tyr Phe Ala 820 825 830 Tyr Ile Cys Pro Ala Asn Val Pro Trp Thr Lys Lys Leu Val Asn Ile 835 840 845 Lys Thr Tyr Glu Thr Ile Ile Asp Arg Ser Phe Lys Leu Gly Ser Asp 850 855 860 Gln Arg Ile Phe Tyr Glu Val Tyr Phe Lys Arg Leu Thr Ser Glu Ile 865 870 875 880 Val Asn Leu Leu Asp Asn Lys Val Leu Cys Ile Ser 885 890 4 891 PRT Variola virus 4 Gln Asn Ala Thr Met Asp Glu Phe Leu Asn Ile Ser Trp Phe Tyr Ile 1 5 10 15 Ser Asn Gly Ile Ser Pro Asp Gly Cys Tyr Ser Leu Asp Asp Gln Tyr 20 25 30 Leu Thr Lys Ile Asn Asn Gly Cys Tyr His Cys Gly Asp Pro Arg Asn 35 40 45 Cys Phe Ala Lys Glu Ile Pro Arg Phe Asp Ile Pro Arg Ser Tyr Leu 50 55 60 Phe Leu Asp Ile Glu Cys His Phe Asp Lys Lys Phe Pro Ser Val Phe 65 70 75 80 Ile Asn Pro Ile Ser His Thr Ser Tyr Cys Tyr Ile Asp Leu Ser Gly 85 90 95 Lys Arg Leu Leu Phe Thr Leu Ile Asn Glu Glu Met Leu Thr Glu Gln 100 105 110 Glu Ile Gln Glu Ala Val Asp Arg Gly Cys Leu Arg Ile Gln Ser Leu 115 120 125 Met Glu Met Asp Tyr Glu Arg Glu Leu Val Leu Cys Ser Glu Ile Val 130
135 140 Leu Leu Gln Ile Ala Lys Gln Leu Leu Glu Leu Thr Phe Asp Tyr Ile 145 150 155 160 Val Thr Phe Asn Gly His Asn Phe Asp Leu Arg Tyr Ile Thr Asn Arg 165 170 175 Leu Glu Leu Leu Thr Gly Glu Lys Ile Ile Phe Arg Ser Pro Asp Lys 180 185 190 Lys Glu Ala Val His Leu Cys Ile Tyr Glu Arg Asn Gln Ser Ser His 195 200 205 Lys Gly Val Gly Gly Met Ala Asn Thr Thr Phe His Val Asn Asn Asn 210 215 220 Asn Gly Thr Ile Phe Phe Asp Leu Tyr Ser Phe Ile Gln Lys Ser Glu 225 230 235 240 Lys Leu Asp Ser Tyr Lys Leu Asp Ser Ile Ser Lys Asn Ala Phe Ser 245 250 255 Cys Met Gly Lys Val Leu Asn Arg Gly Val Arg Glu Met Thr Phe Ile 260 265 270 Gly Asp Asp Thr Thr Asp Ala Lys Gly Lys Ala Ala Val Phe Ala Lys 275 280 285 Val Leu Thr Thr Gly Asn Tyr Val Thr Val Asp Asp Ile Ile Cys Lys 290 295 300 Val Ile His Lys Asp Ile Trp Glu Asn Gly Phe Lys Val Val Leu Ser 305 310 315 320 Cys Pro Thr Leu Thr Asn Asp Thr Tyr Lys Leu Ser Phe Gly Lys Asp 325 330 335 Asp Val Asp Leu Ala Gln Met Tyr Lys Asp Tyr Asn Leu Asn Ile Ala 340 345 350 Leu Asp Met Ala Arg Tyr Cys Ile His Asp Ala Cys Leu Cys Gln Tyr 355 360 365 Leu Trp Glu Tyr Tyr Gly Val Glu Thr Lys Thr Asp Ala Gly Ala Ser 370 375 380 Thr Tyr Val Leu Pro Gln Ser Met Val Phe Gly Tyr Lys Ala Ser Thr 385 390 395 400 Val Ile Lys Gly Pro Leu Leu Lys Leu Leu Leu Glu Thr Lys Thr Ile 405 410 415 Leu Val Arg Ser Glu Thr Lys Gln Lys Phe Pro Tyr Glu Gly Gly Lys 420 425 430 Val Phe Ala Pro Lys Gln Lys Met Phe Ser Asn Asn Val Leu Ile Phe 435 440 445 Asp Tyr Asn Ser Leu Tyr Pro Asn Val Cys Ile Phe Gly Asn Leu Ser 450 455 460 Pro Glu Thr Leu Val Gly Val Val Val Ser Ser Asn Arg Leu Glu Glu 465 470 475 480 Glu Ile Asn Asn Gln Leu Leu Leu Gln Lys Tyr Pro Pro Pro Arg Tyr 485 490 495 Ile Thr Val His Cys Glu Pro Arg Leu Pro Asn Leu Ile Ser Glu Ile 500 505 510 Ala Ile Phe Asp Arg Ser Ile Glu Gly Thr Ile Pro Arg Leu Leu Arg 515 520 525 Thr Phe Leu Ala Glu Arg Ala Arg Tyr Lys Lys Met Leu Lys Gln Ala 530 535 540 Thr Ser Ser Thr Glu Lys Ala Ile Tyr Asp Ser Met Gln Tyr Thr Tyr 545 550 555 560 Lys Ile Ile Ala Asn
Ser Val Tyr Gly Leu Met Gly Phe Arg Asn Ser 565 570 575 Ala Leu Tyr Ser Tyr Ala Ser Ala Lys Ser Cys Thr Ser Ile Gly Arg 580 585 590 Arg Met Ile Leu Tyr Leu Glu Ser Val Leu Asn Gly Ala Glu Leu Ser 595 600 605 Asn Gly Met Leu Arg Phe Ala Asn Pro Leu Ser Asn Pro Phe Tyr Met 610 615 620 Asp Asp Arg Asp Ile Asn Pro Ile Val Lys Thr Ser Leu Pro Ile Asp 625 630 635 640 Tyr Arg Phe Arg Phe Arg Ser Val Tyr Gly Asp Thr Asp Ser Val Phe 645 650 655 Thr Glu Ile Asp Ser Gln Asp Val Asp Lys Ser Ile Glu Ile Ala Lys 660 665 670 Glu Leu Glu Arg Leu Ile Asn Ser Arg Val Leu Phe Asn Asn Phe Lys 675 680 685 Ile Glu Phe Glu Ala Val Tyr Lys Asn Leu Ile Met Gln Ser Lys Lys 690 695 700 Lys Tyr Thr Thr Met Lys Tyr Ser Ala Ser Ser Asn Ser Lys Ser Val 705 710 715 720 Pro Glu Arg Ile Asn Lys Gly Thr Ser Glu Thr Arg Arg Asp Val Ser 725 730 735 Lys Phe His Lys Asn Met Ile Lys Ile Tyr Lys Thr Arg Leu Ser Glu 740 745 750 Met Leu Ser Glu Gly Arg Met Asn Ser Asn Gln Val Cys Ile Asp Ile 755 760 765 Leu Arg Ser Leu Glu Thr Asp Leu Arg Ser Glu Phe Asp Ser Arg Ser 770 775 780 Ser Pro Leu Glu Leu Phe Met Leu Ser Arg Met His His Leu Asn Tyr 785 790 795 800 Lys Ser Ala Asp Asn Pro Asn Met Tyr Leu Val Thr Glu Tyr Asn Lys 805 810 815 Asn Asn Pro Glu Thr Ile Glu Leu Gly Glu Arg Tyr Tyr Phe Ala Tyr 820 825 830 Ile Cys Pro Ala Asn Val Pro Trp Thr Lys Lys Leu Val Asn Ile Lys 835 840 845 Thr Tyr Glu Thr Ile Ile Asp Arg Ser Phe Lys Leu Gly Ser Asp Gln 850 855 860 Arg Ile Phe Tyr Glu Val Tyr Phe Lys Arg Leu Thr Ser Glu Ile Val 865 870 875 880 Asn Leu Leu Asp Asn Lys Val Leu Cys Ile Ser 885 890 5 874 PRT Fowlpox virus 5 Glu Lys Gln Tyr Leu Gln Glu Tyr Leu Asp Ile Thr Trp Phe Tyr Leu 1 5 10 15 Leu Asn Asn Ile Thr Pro Asp Gly Cys Tyr Lys Ile Asp Ile Glu His 20 25 30 Leu Thr Pro Ile Lys Lys Asp Cys Tyr His Cys Asp Asp Val Ser Lys 35 40 45 Val Phe Ile Gln Glu Ile Pro Ile Phe Glu Val Lys Phe Thr Tyr Leu 50 55 60 Leu Phe Asp Ile Glu Cys Gln Phe Asp Lys Lys Phe Pro Ser Val Phe 65 70 75 80 Val Asn Pro Ile Ser His Ile Ser Cys Trp Ile Ile Asp Lys Val Thr
85 90 95 Glu Tyr Lys Phe Thr Leu Ile Asn Thr Asp Ile Leu Pro Asp Lys Glu 100 105 110 Pro Ser Ile Leu His His Lys Asp Phe Ser Pro Lys Asp Arg Ile Thr 115 120 125 Tyr Cys Thr Glu Ile Val Met Leu Leu Ile Met Lys Lys Ile Leu Glu 130 135 140 His Arg Phe Asp Phe Val Ile Thr Phe Asn Gly Asn Asn Phe Asp Ile 145 150 155 160 Arg Tyr Ile Ser Gly Arg Leu Glu Ile Leu Glu Lys Ser Phe Ile Tyr 165 170 175 Phe Ser Leu Pro Asp Ala Thr Glu Thr Val Lys Leu Lys Ile Phe Glu 180 185 190 Arg Phe Val Thr Gly Gly Thr Phe Thr Asn Lys Thr Tyr His Ile Asn 195 200 205 Asn Asn Asn Gly Val Met Phe Phe Asp Leu Tyr Ala Phe Ile Gln Lys 210 215 220 Thr Glu Arg Leu Asp Ser Tyr Lys Leu Asp Ser Ile Ser Lys Asn Ile 225 230 235 240 Phe Asn Cys Asn Val Ala Ile Lys Glu Ile Asp Asp Thr Ile Leu Thr 245 250 255 Leu Glu Ala Thr Val Lys Asp Asn Ser Lys Asp Lys Leu Ser Ile Phe 260 265 270 Ser Arg Val Leu Glu Thr Gly Asn Tyr Ile Thr Ile Gly Asp Asn Asn 275 280 285 Val Ser Lys Ile Val Tyr Lys Asp Ile Asn Gln Asp Ser Phe Ile Ile 290 295 300 Lys Val Ile Ser Asn Arg Asp Tyr Glu Ile Gly Ser Val His Asn Ile 305 310 315 320 Ser Phe Gly Lys Asp Asp Val Asp Leu Lys Asp Met Tyr Lys Asn Tyr 325 330 335 Asn Leu Glu Ile Ala Leu Asp Met Glu Arg Tyr Cys Ile His Asp Ala 340 345 350 Cys Leu Cys Lys Tyr Ile Trp Asp Tyr Tyr Arg Val Pro Ser Lys Ile 355 360 365 Asn Ala Ala Ser Ser Thr Tyr Leu Leu Pro Gln Ser Leu Ala Leu Glu 370 375 380 Tyr Arg Ala Ser Thr Leu Ile Lys Gly Pro Leu Leu Lys Leu Leu Leu 385 390 395 400 Glu Glu Arg Val Ile Tyr Thr Arg Lys Ile Thr Lys Val Arg Tyr Pro 405 410 415 Tyr Ile Gly Gly Lys Val Phe Leu Pro Ser Gln Lys Thr Phe Glu Asn 420 425 430 Asn Val Met Ile Phe Asp Tyr Asn Ser Leu Tyr Pro Asn Val Cys Ile 435 440 445 Tyr Gly Asn Leu Ser Pro Glu Lys Leu Val Cys Ile Leu Leu Asn Ser 450 455 460 Asn Lys Leu Glu Ser Glu Ile Asn Met Arg Thr Ile Lys Ser Lys Tyr 465 470 475 480 Pro Tyr Pro Glu Tyr Val Cys Val Ser Cys Glu Ser Arg Leu Ser Asp 485 490 495 Tyr Tyr Ser Glu Ile Ile Val Tyr Asp Arg Arg Glu Lys Gly Ile Ile 500 505 510 Pro Lys Leu Leu Glu
Met Phe Ile Gly Lys Arg Lys Glu Tyr Lys Asn 515 520 525 Leu Leu Lys Thr Ala Ser Thr Thr Ile Glu Ser Thr Leu Tyr Asp Ser 530 535 540 Leu Gln Tyr Ile Tyr Lys Ile Ile Ala Asn Ser Val Tyr Gly Leu Met 545 550 555 560 Gly Phe Ser Asn Ser Thr Leu Tyr Ser Tyr Ser Ser Ala Lys Thr Cys 565 570 575 Thr Thr Ile Gly Arg Asn Met Ile Thr Tyr Leu Asp Ser Ile Met Asn 580 585 590 Gly Ala Val Trp Glu Asn Asp Lys Leu Ile Leu Ala Asp Phe Pro Arg 595 600 605 Asn Ile Phe Ser Gly Glu Thr Met Phe Asn Lys Glu Leu Glu Val Pro 610 615 620 Asn Met Asn Glu Ser Phe Lys Phe Arg Ser Val Tyr Gly Asp Thr Asp 625 630 635 640 Ser Ile Phe Ser Glu Ile Ser Thr Lys Asp Ile Glu Lys Thr Ala Lys 645 650 655 Ile Ala Lys His Leu Glu His Ile Ile Asn Thr Lys Ile Leu His Ala 660 665 670 Asn Phe Lys Ile Glu Phe Glu Ala Ile Tyr Thr Gln Leu Ile Leu Gln 675 680 685 Ser Lys Lys Lys Tyr Thr Thr Ile Lys Tyr Leu Ala Asn Tyr Lys Pro 690 695 700 Gly Asp Lys Pro Ile Arg Val Asn Lys Gly Thr Ser Glu Thr Arg Arg 705 710 715 720 Asp Val Ala Leu Phe His Lys His Met Ile Gln Arg Tyr Lys Asp Met 725 730 735 Leu Met Lys Leu Leu Met Gln Ser Lys Gly Gln Gln Glu Ile Thr Arg 740 745 750 Leu Ile Leu Gln Ser Leu Glu Thr Asp Met Ile Ser Glu Phe Thr His 755 760 765 Asn Arg Glu Phe Glu Lys Tyr Leu Leu Ser Arg Lys His His Asn Asn 770 775 780 Tyr Lys Ser Ala Thr His Ser Asn Phe Glu Leu Val Lys Arg Tyr Asn 785 790 795 800 Leu Glu Asn Thr Glu Lys Ile Glu Ile Gly Glu Arg Tyr Tyr Tyr Ile 805 810 815 Tyr Ile Cys Asp Ile Ser Leu Pro Trp Gln Lys Lys Leu Cys Asn Ile 820 825 830 Leu Ser Tyr Glu Val Ile Ala Asp Ser Lys Phe Tyr Leu Pro Lys Asp 835 840 845 Lys Arg Ile Phe Tyr Glu Ile Tyr Phe Lys Arg Ile Ala Ser Glu Val 850 855 860 Val Asn Leu Leu Thr Asp Lys Thr Gln Cys 865 870 6 738 PRT Bos taurus (Bovine) 6 Pro Ser Phe Ala Pro Tyr Glu Ala Asn Val Asp Phe Glu Ile Arg Phe 1 5 10 15 Met Val Asp Thr Asp Ile Val Gly Cys Asn Trp Leu Glu Leu Pro Ala 20 25 30 Gly Lys Tyr Ile Leu Arg Pro Glu Gly Lys Ala Thr Leu Cys Gln Leu 35 40 45 Glu Ala Asp Val Leu Trp Ser Asp Val Ile Ser His Pro Pro Glu
Gly 50 55 60 Glu Trp Gln Arg Ile Ala Pro Leu Arg Val Leu Ser Phe Asp Ile Glu 65 70 75 80 Cys Ala Gly Arg Lys Gly Ile Phe Pro Glu Pro Glu Arg Asp Pro Val 85 90 95 Ile Gln Ile Cys Ser Leu Gly Leu Arg Trp Gly Glu Pro Glu Pro Phe 100 105 110 Leu Arg Leu Ala Leu Thr Leu Arg Pro Cys Ala Pro Ile Leu Gly Ala 115 120 125 Lys Val Gln Ser Tyr Glu Arg Glu Glu Asp Leu Leu Gln Ala Trp Ser 130 135 140 Thr Phe Ile Arg Ile Met Asp Pro Asp Val Ile Thr Gly Tyr Asn Ile 145 150 155 160 Gln Asn Phe Asp Leu Pro Tyr Leu Ile Ser Arg Ala Gln Thr Leu Lys 165 170 175 Val Pro Gly Phe Pro Leu Leu Gly Arg Val Ile Gly Leu Arg Ser Asn 180 185 190 Ile Arg Glu Ser Ser Phe Gln Ser Arg Gln Thr Gly Arg Arg Asp Ser 195 200 205 Lys Val Val Ser Met Val Gly Arg Val Gln Met Asp Met Leu Gln Val 210 215 220 Leu Leu Arg Glu Tyr Lys Leu Arg Ser Tyr Thr Leu Asn Ala Val Ser 225 230 235 240 Phe His Phe Leu Gly Glu Gln Lys Glu Asp Val Gln His Ser Ile Ile 245 250 255 Thr Asp Leu Gln Asn Gly Asn Asp Gln Thr Arg Arg Arg Leu Ala Val 260 265 270 Tyr Cys Leu Lys Asp Ala Phe Leu Pro Leu Arg Leu Leu Glu Arg Leu 275 280 285 Met Val Leu Val Asn Ala Met Glu Met Ala Arg Val Thr Gly Val Pro 290 295 300 Leu Gly Tyr Leu Leu Ser Arg Gly Gln Gln Val Lys Val Val Ser Gln 305 310 315 320 Leu Leu Arg Gln Ala Met Arg Gln Gly Leu Leu Met Pro Val Val Lys 325 330 335 Thr Glu Gly Gly Glu Asp Tyr Thr Gly Ala Thr Val Ile Glu Pro Leu 340 345 350 Lys Gly Tyr Tyr Asp Val Pro Ile Ala Thr Leu Asp Phe Ser Ser Leu 355 360 365 Tyr Pro Ser Ile Met Met Ala His Asn Leu Cys Tyr Thr Thr Leu Leu 370 375 380 Arg Pro Gly Ala Ala Gln Lys Leu Gly Leu Thr Glu Asp Gln Phe Ile 385 390 395 400 Lys Thr Pro Thr Gly Asp Glu Phe Val Lys Ala Ser Val Arg Lys Gly 405 410 415 Leu Leu Pro Gln Ile Leu Glu Asn Leu Leu Ser Ala Arg Lys Arg Ala 420 425 430 Lys Ala Glu Leu Ala Lys Glu Thr Asp Pro Leu Arg Arg Gln Val Leu 435 440 445 Asp Gly Arg Gln Leu Ala Leu Lys Val Ser Ala Asn Ser Val Tyr Gly 450 455 460 Phe Thr Gly Ala Gln Val Gly Arg Leu Pro Cys Leu Glu Ile Ser Gln 465 470 475 480 Ser Val Thr Gly Phe
Gly Arg Gln Met Ile Glu Lys Thr Lys Gln Leu 485 490 495 Val Glu Thr Lys Tyr Thr Val Glu Asn Gly Tyr Ser Thr Ser Ala Lys 500 505 510 Val Val Tyr Gly Asp Thr Asp Ser Val Met Cys Arg Phe Gly Val Ser 515 520 525 Ser Val Ala Glu Ala Met Ala Leu Gly Arg Glu Ala Ala Asp Trp Val 530 535 540 Ser Gly His Phe Pro Ser Pro Ile Arg Leu Glu Phe Glu Lys Val Tyr 545 550 555 560 Phe Pro Tyr Leu Leu Ile Ser Lys Lys Arg Tyr Ala Gly Leu Leu Phe 565 570 575 Ser Ser Arg Pro Asp Ala His Asp Arg Met Asp Cys Lys Gly Leu Glu 580 585 590 Ala Val Arg Arg Asp Asn Cys Pro Leu Val Ala Asn Leu Val Thr Ala 595 600 605 Ser Leu Arg Arg Leu Leu Ile Asp Arg Asp Pro Ser Gly Ala Val Ala 610 615 620 His Ala Gln Asp Val Ile Ser Asp Leu Leu Cys Asn Arg Ile Asp Ile 625 630 635 640 Ser Gln Leu Val Ile Thr Lys Glu Leu Thr Arg Ala Ala Ala Asp Tyr 645 650 655 Ala Gly Lys Gln Ala His Val Glu Leu Ala Glu Arg Met Arg Lys Arg 660 665 670 Asp Pro Gly Ser Ala Pro Ser Leu Gly Asp Arg Val Pro Tyr Val Ile 675 680 685 Ile Ser Ala Ala Lys Gly Val Ala Ala Tyr Met Lys Ser Glu Asp Pro 690 695 700 Leu Phe Val Leu Glu His Ser Leu Pro Ile Asp Thr Gln Tyr Tyr Leu 705 710 715 720 Glu Gln Gln Leu Ala Lys Pro Leu Leu Arg Ile Phe Glu Pro Ile Leu 725 730 735 Gly Glu 7 738 PRT Homo sapiens 7 Pro Ser Phe Ala Pro Tyr Glu Ala Asn Val Asp Phe Glu Ile Arg Phe 1 5 10 15 Met Val Asp Thr Asp Ile Val Gly Cys Asn Trp Leu Glu Leu Pro Ala 20 25 30 Gly Lys Tyr Ala Leu Arg Leu Lys Glu Lys Ala Thr Gln Cys Gln Leu 35 40 45 Glu Ala Asp Val Leu Trp Ser Asp Val Val Ser His Pro Pro Glu Gly 50 55 60 Pro Trp Gln Arg Ile Ala Pro Leu Arg Val Leu Ser Phe Asp Ile Glu 65 70 75 80 Cys Ala Gly Arg Lys Gly Ile Phe Pro Glu Pro Glu Arg Asp Pro Val 85 90 95 Ile Gln Ile Cys Ser Leu Gly Leu Arg Trp Gly Glu Pro Glu Pro Phe 100 105 110 Leu Arg Leu Ala Leu Thr Leu Arg Pro Cys Ala Pro Ile Leu Gly Ala 115 120 125 Lys Val Gln Ser Tyr Glu Lys Glu Glu Asp Leu Leu Gln Ala Trp Ser 130 135 140 Thr Phe Ile Arg Ile Met Asp Pro Asp Val Ile Thr Gly Tyr Asn Ile 145 150 155 160 Gln Asn Phe Asp Leu Pro Tyr Leu Ile
Ser Arg Ala Gln Thr Leu Lys 165 170 175 Val Gln Thr Phe Pro Phe Leu Gly Arg Val Ala Gly Leu Cys Ser Asn 180 185 190 Ile Arg Asp Ser Ser Phe Gln Ser Lys Gln Thr Gly Arg Arg Asp Thr 195 200 205 Lys Val Val Ser Met Val Gly Arg Val Gln Met Asp Met Leu Gln Val 210 215 220 Leu Leu Arg Glu Tyr Lys Leu Arg Ser His Thr Leu Asn Ala Val Ser 225 230 235 240 Phe His Phe Leu Gly Glu Gln Lys Glu Asp Val Gln His Ser Ile Ile 245 250 255 Thr Asp Leu Gln Asn Gly Asn Asp Gln Thr Arg Arg Arg Leu Ala Val 260 265 270 Tyr Cys Leu Lys Asp Ala Tyr Leu Pro Leu Arg Leu Leu Glu Arg Leu 275 280 285 Met Val Leu Val Asn Ala Val Glu Met Ala Arg Val Thr Gly Val Pro 290 295 300 Leu Ser Tyr Leu Leu Ser Arg Gly Gln Gln Val Lys Val Val Ser Gln 305 310 315 320 Leu Leu Arg Gln Ala Met His Glu Gly Leu Leu Met Pro Val Val Lys 325 330 335 Ser Glu Gly Gly Glu Asp Tyr Thr Gly Ala Thr Val Ile Glu Pro Leu 340 345 350 Lys Gly Tyr Tyr Asp Val Pro Ile Ala Thr Leu Asp Phe Ser Ser Leu 355 360 365 Tyr Pro Ser Ile Met Met Ala His Asn Leu Cys Tyr Thr Thr Leu Leu 370 375 380 Arg Pro Gly Thr Ala Gln Lys Leu Gly Leu Thr Glu Asp Gln Phe Ile 385 390 395 400 Arg Thr Pro Thr Gly Asp Glu Phe Val Lys Thr Ser Val Arg Lys Gly 405 410 415 Leu Leu Pro Gln Ile Leu Glu Asn Leu Leu Ser Ala Arg Lys Arg Ala 420 425 430 Lys Ala Glu Leu Ala Lys Glu Thr Asp Pro Leu Arg Arg Gln Val Leu 435 440 445 Asp Gly Arg Gln Leu Ala Leu Lys Val Ser Ala Asn Ser Val Tyr Gly 450 455 460 Phe Thr Gly Ala Gln Val Gly Lys Leu Pro Cys Leu Glu Ile Ser Gln 465 470 475 480 Ser Val Thr Gly Phe Gly Arg Gln Met Ile Glu Lys Thr Lys Gln Leu 485 490 495 Val Glu Ser Lys Tyr Thr Val Glu Asn Gly Tyr Ser Thr Ser Ala Lys 500 505 510 Val Val Tyr Gly Asp Thr Asp Ser Val Met Cys Arg Phe Gly Val Ser 515 520 525 Ser Val Ala Glu Ala Met Ala Leu Gly Arg Glu Ala Ala Asp Trp Val 530 535 540 Ser Gly His Phe Pro Ser Pro Ile Arg Leu Glu Phe Glu Lys Val Tyr 545 550 555 560 Phe Pro Tyr Leu Leu Ile Ser Lys Lys Arg Tyr Ala Gly Leu Leu Phe 565 570 575 Ser Ser Arg Pro Asp Ala His Asp Arg Met Asp Cys Lys Gly Leu Glu 580
585 590 Ala Val Arg Arg Asp Asn Cys Pro Leu Val Ala Asn Leu Val Thr Ala 595 600 605 Ser Leu Arg Arg Leu Leu Ile Asp Arg Asp Pro Glu Gly Ala Val Ala 610 615 620 His Ala Gln Asp Val Ile Ser Asp Leu Leu Cys Asn Arg Ile Asp Ile 625 630 635 640 Ser Gln Leu Val Ile Thr Lys Glu Leu Thr Arg Ala Ala Ser Asp Tyr 645 650 655 Ala Gly Lys Gln Ala His Val Glu Leu Ala Glu Arg Met Arg Lys Arg 660 665 670 Asp Pro Gly Ser Ala Pro Ser Leu Gly Asp Arg Val Pro Tyr Val Ile 675 680 685 Ile Ser Ala Ala Lys Gly Val Ala Ala Tyr Met Lys Ser Glu Asp Pro 690 695 700 Leu Phe Val Leu Glu His Ser Leu Pro Ile Asp Thr Gln Tyr Tyr Leu 705 710 715 720 Glu Gln Gln Leu Ala Lys Pro Leu Leu Arg Ile Phe Glu Pro Ile Leu 725 730 735 Gly Glu 8 734 PRT Candida albicans (Yeast) 8 Ile Asp Pro Cys Ile Thr Tyr Asp Asn Ile Asn Tyr Leu Leu Arg Leu 1 5 10 15 Met Ile Asp Cys Lys Ile Thr Gly Met Ser Trp Ile Thr Leu Pro Arg 20 25 30 Asp Lys Tyr Lys Ile Val Asn Asn Lys Ile Ser Thr Cys Gln Ile Glu 35 40 45 Cys Ser Ile Asp Tyr Arg Asp Leu Ile Ser His Pro Pro Glu Gly Glu 50 55 60 Trp Leu Lys Met Ala Pro Leu Arg Ile Leu Ser Phe Asp Ile Glu Cys 65 70 75 80 Ala Gly Arg Lys Gly Val Phe Pro Glu Ala Glu His Asp Pro Val Ile 85 90 95 Gln Ile Ala Asn Val Val Gln Lys Ser Gly Glu Ser Lys Pro Phe Val 100 105 110 Arg Asn Val Phe Thr Val Asn Thr Cys Ser Ser Ile Ile Gly Ser Gln 115 120 125 Ile Phe Glu His Gln Arg Glu Glu Asp Met Leu Met His Trp Lys Glu 130 135 140 Phe Ile Thr Lys Val Asp Pro Asp Val Ile Ile Gly Tyr Asn Thr Ala 145 150 155 160 Asn Phe Asp Ile Pro Tyr Val Leu Asn Arg Ala Lys Ala Leu Gly Leu 165 170 175 Asn Asp Phe Pro Phe Phe Gly Arg Leu Lys Arg Val Lys Gln Glu Ile 180 185 190 Lys Asp Ala Val Phe Ser Ser Arg Ala Tyr Gly Thr Arg Glu Asn Lys 195 200 205 Val Val Asn Ile Asp Gly Arg Met Gln Leu Asp Leu Leu Gln Phe Ile 210 215 220 Gln Arg Glu Tyr Lys Leu Arg Ser Tyr Thr Leu Asn Ser Val Ser Ala 225 230 235 240 His Phe Leu Gly Glu Gln Lys Glu Asp Val Gln His Ser Ile Ile Thr 245 250 255 Asp Leu Gln Asn Gly Thr Lys Glu Thr Arg Arg Arg Leu Ala Val Tyr 260 265
270 Cys Leu Lys Asp Ala Phe Leu Pro Leu Arg Leu Leu Asp Lys Leu Met 275 280 285 Cys Leu Val Asn Tyr Thr Glu Met Ala Arg Val Thr Gly Val Pro Phe 290 295 300 Ser Tyr Leu Leu Ser Arg Gly Gln Gln Ile Lys Val Ile Ser Gln Leu 305 310 315 320 Phe Arg Lys Cys Leu Gln Glu Asp Ile Val Ile Pro Asn Leu Lys Ser 325 330 335 Glu Gly Ser Asn Glu Glu Tyr Glu Gly Ala Thr Val Ile Glu Pro Glu 340 345 350 Arg Gly Tyr Tyr Asp Val Pro Ile Ala Thr Leu Asp Phe Ser Ser Leu 355 360 365 Tyr Pro Ser Ile Met Met Ala His Asn Leu Cys Tyr Thr Thr Leu Leu 370 375 380 Asn Lys Asn Ser Ile Lys Ala Phe Gly Leu Thr Glu Asp Asp Tyr Thr 385 390 395 400 Lys Thr Pro Asn Gly Asp Tyr Phe Val His Ser Asn Leu Arg Lys Gly 405 410 415 Ile Leu Pro Thr Ile Leu Asp Glu Leu Leu Thr Ala Arg Lys Lys Ala 420 425 430 Lys Ala Asp Leu Lys Lys Glu Thr Asp Pro Phe Lys Lys Asp Val Leu 435 440 445 Asn Gly Arg Gln Leu Ala Leu Lys Ile Ser Ala Asn Ser Val Tyr Gly 450 455 460 Phe Thr Gly Ala Thr Val Gly Lys Leu Pro Cys Leu Ala Ile Ser Ser 465 470 475 480 Ser Val Thr Ala Phe Gly Arg Glu Met Ile Glu Lys Thr Lys Asn Glu 485 490 495 Val Gln Glu Tyr Tyr Ser Lys Lys Asn Gly His Pro Tyr Asp Ala Lys 500 505 510 Val Ile Tyr Gly Asp Thr Asp Ser Val Met Val Lys Phe Gly Tyr Gln 515 520 525 Asp Leu Glu Thr Cys Met Lys Leu Gly Glu Glu Ala Ala Asn Tyr Val 530 535 540 Ser Thr Lys Phe Lys Asn Pro Ile Lys Leu Glu Phe Glu Lys Val Tyr 545 550 555 560 Phe Pro Tyr Leu Leu Ile Asn Lys Lys Arg Tyr Ala Gly Leu Tyr Trp 565 570 575 Thr Arg Pro Glu Lys Phe Asp Lys Met Asp Thr Lys Gly Ile Glu Thr 580 585 590 Val Arg Arg Asp Asn Cys Gln Leu Val Gln Asn Val Ile Thr Lys Val 595 600 605 Leu Glu Phe Ile Leu Glu Glu Arg Asp Val Pro Lys Ala Gln Arg Phe 610 615 620 Val Lys Gln Thr Ile Ala Asp Leu Leu Gln Asn Arg Ile Asp Leu Ser 625 630 635 640 Gln Leu Val Ile Thr Lys Ala Tyr Ser Lys His Asp Tyr Ser Ala Lys 645 650 655 Gln Ala His Val Glu Leu Ala Glu Arg Met Arg Lys Arg Asp Pro Gly 660 665 670 Ser Ala Pro Thr Leu Gly Asp Arg Val Ala Tyr Val Ile Ile Lys Thr 675 680 685 Gly Gly Asp Lys Asn Tyr Glu
Lys Ser Glu Asp Pro Leu Tyr Val Leu 690 695 700 Glu Asn Ser Leu Pro Ile Asp Val Lys Tyr Tyr Leu Asp Gln Gln Leu 705 710 715 720 Thr Lys Pro Leu Glu Arg Ile Phe Ile Pro Ile Leu Gly Glu 725 730 9 734 PRT Saccharomyces cerevisiae 9 Ser Asn Gly Thr Thr Thr Tyr Asp Asn Ile Ala Tyr Thr Leu Arg Leu 1 5 10 15 Met Val Asp Cys Gly Ile Val Gly Met Ser Trp Ile Thr Leu Pro Lys 20 25 30 Gly Lys Tyr Ser Met Ile Glu Pro Asn Asn Arg Val Ser Ser Cys Gln 35 40 45 Leu Glu Val Ser Ile Asn Tyr Arg Asn Leu Ile Ala His Pro Ala Glu 50 55 60 Gly Asp Trp Ser His Thr Ala Pro Leu Arg Ile Met Ser Phe Asp Ile 65 70 75 80 Glu Cys Ala Gly Arg Ile Gly Val Phe Pro Glu Pro Glu Tyr Asp Pro 85 90 95 Val Ile Gln Ile Ala Asn Val Val Ser Ile Ala Gly Ala Lys Lys Pro 100 105 110 Phe Ile Arg Asn Val Phe Thr Leu Asn Thr Cys Ser Pro Ile Thr Gly 115 120 125 Ser Met Ile Phe Ser His Ala Thr Glu Glu Glu Met Leu Ser Asn Trp 130 135 140 Arg Asn Phe Ile Ile Lys Val Asp Pro Asp Val Ile Ile Gly Tyr Asn 145 150 155 160 Thr Thr Asn Phe Asp Ile Pro Tyr Leu Leu Asn Arg Ala Lys Ala Leu 165 170 175 Lys Val Asn Asp Phe Pro Tyr Phe Gly Arg Leu Lys Thr Val Lys Gln 180 185 190 Glu Ile Lys Glu Ser Val Phe Ser Ser Lys Ala Tyr Gly Thr Arg Glu 195 200 205 Thr Lys Asn Val Asn Ile Asp Gly Arg Leu Gln Leu Asp Leu Leu Gln 210 215 220 Phe Ile Gln Arg Glu Tyr Lys Leu Arg Ser Tyr Thr Leu Asn Ala Val 225 230 235 240 Ser Ala His Phe Leu Gly Glu Gln Lys Glu Asp Val His Tyr Ser Ile 245 250 255 Ile Ser Asp Leu Gln Asn Gly Asp Ser Glu Thr Arg Arg Arg Leu Ala 260 265 270 Val Tyr Cys Leu Lys Asp Ala Tyr Leu Pro Leu Arg Leu Met Glu Lys 275 280 285 Leu Met Ala Leu Val Asn Tyr Thr Glu Met Ala Arg Val Thr Gly Val 290 295 300 Pro Phe Ser Tyr Leu Leu Ala Arg Gly Gln Gln Ile Lys Val Val Ser 305 310 315 320 Gln Leu Phe Arg Lys Cys Leu Glu Ile Asp Thr Val Ile Pro Asn Met 325 330 335 Gln Ser Gln Ala Ser Asp Asp Gln Tyr Glu Gly Ala Thr Val Ile Glu 340 345 350 Pro Ile Arg Gly Tyr Tyr Asp Val Pro Ile Ala Thr Leu Asp Phe Asn 355 360 365 Ser Leu Tyr Pro Ser Ile Met Met Ala His Asn Leu Cys
Tyr Thr Thr 370 375 380 Leu Cys Asn Lys Ala Thr Val Glu Arg Leu Asn Leu Lys Ile Asp Glu 385 390 395 400 Asp Tyr Val Ile Thr Pro Asn Gly Asp Tyr Phe Val Thr Thr Lys Arg 405 410 415 Arg Arg Gly Ile Leu Pro Ile Ile Leu Asp Glu Leu Ile Ser Ala Arg 420 425 430 Lys Arg Ala Lys Lys Asp Leu Arg Asp Glu Lys Asp Pro Phe Lys Arg 435 440 445 Asp Val Leu Asn Gly Arg Gln Leu Ala Leu Lys Ile Ser Ala Asn Ser 450 455 460 Val Tyr Gly Phe Thr Gly Ala Thr Val Gly Lys Leu Pro Cys Leu Ala 465 470 475 480 Ile Ser Ser Ser Val Thr Ala Tyr Gly Arg Thr Met Ile Leu Lys Thr 485 490 495 Lys Thr Ala Val Gln Glu Lys Tyr Cys Ile Lys Asn Gly Tyr Lys His 500 505 510 Asp Ala Val Val Val Tyr Gly Asp Thr Asp Ser Val Met Val Lys Phe 515 520 525 Gly Thr Thr Asp Leu Lys Glu Ala Met Asp Leu Gly Thr Glu Ala Ala 530 535 540 Lys Tyr Val Ser Thr Leu Phe Lys His Pro Ile Asn Leu Glu Phe Glu 545 550 555 560 Lys Ala Tyr Phe Pro Tyr Leu Leu Ile Asn Lys Lys Arg Tyr Ala Gly 565 570 575 Leu Phe Trp Thr Asn Pro Asp Lys Phe Asp Lys Leu Asp Gln Lys Gly 580 585 590 Leu Ala Ser Val Arg Arg Asp Ser Cys Ser Leu Val Ser Ile Val Met 595 600 605 Asn Lys Val Leu Lys Lys Ile Leu Ile Glu Arg Asn Val Asp Gly Ala 610 615 620 Leu Ala Phe Val Arg Glu Thr Ile Asn Asp Ile Leu His Asn Arg Val 625 630 635 640 Asp Ile Ser Lys Leu Ile Ile Ser Lys Thr Leu Ala Pro Asn Tyr Thr 645 650 655 Asn Pro Gln Pro His Ala Val Leu Ala Glu Arg Met Lys Arg Arg Glu 660 665 670 Gly Val Gly Pro Asn Val Gly Asp Arg Val Asp Tyr Val Ile Ile Gly 675 680 685 Gly Asn Asp Lys Leu Tyr Asn Arg Ala Glu Asp Pro Leu Phe Val Leu 690 695 700 Glu Asn Asn Ile Gln Val Asp Ser Arg Tyr Tyr Leu Thr Asn Gln Leu 705 710 715 720 Gln Asn Pro Ile Ile Ser Ile Val Ala Pro Ile Ile Gly Asp 725 730 10 735 PRT Schizosaccharomyces pombe 10 Val Gly Val Thr Thr Phe Glu Ser Asn Thr Gln Tyr Leu Leu Arg Phe 1 5 10 15 Met Ile Asp Cys Asp Val Val Gly Met Asn Trp Ile His Leu Pro Ala 20 25 30 Ser Lys Tyr Gln Phe Arg Tyr Gln Asn Arg Val Ser Asn Cys Gln Ile 35 40 45 Glu Ala Trp Ile Asn Tyr Lys Asp Leu Ile Ser Leu Pro Ala Glu Gly
50 55 60 Gln Trp Ser Lys Met Ala Pro Leu Arg Ile Met Ser Phe Asp Ile Glu 65 70 75 80 Cys Ala Gly Arg Lys Gly Val Phe Pro Asp Pro Ser Ile Asp Pro Val 85 90 95 Ile Gln Ile Ala Ser Ile Val Thr Gln Tyr Gly Asp Ser Thr Pro Phe 100 105 110 Val Arg Asn Val Phe Cys Val Asp Thr Cys Ser Gln Ile Val Gly Thr 115 120 125 Gln Val Tyr Glu Phe Gln Asn Gln Ala Glu Met Leu Ser Ser Trp Ser 130 135 140 Lys Phe Val Arg Asp Val Asp Pro Asp Val Leu Ile Gly Tyr Asn Ile 145 150 155 160 Cys Asn Phe Asp Ile Pro Tyr Leu Leu Asp Arg Ala Lys Ser Leu Arg 165 170 175 Ile His Asn Phe Pro Leu Leu Gly Arg Ile His Asn Phe Phe Ser Val 180 185 190 Ala Lys Glu Thr Ser Phe Ser Ser Lys Ala Tyr Gly Thr Arg Glu Ser 195 200 205 Lys Thr Thr Ser Ile Pro Gly Arg Leu Gln Leu Asp Met Leu Gln Val 210 215 220 Met Gln Arg Asp Phe Lys Leu Arg Ser Tyr Ser Leu Asn Ala Val Cys 225 230 235 240 Ser Gln Phe Leu Gly Glu Gln Lys Glu Asp Val His Tyr Ser Ile Ile 245 250 255 Thr Asp Leu Gln Asn Gly Thr Ala Asp Ser Arg Arg Arg Leu Ala Ile 260 265 270 Tyr Cys Leu Lys Asp Ala Tyr Leu Pro Gln Arg Leu Met Asp Lys Leu 275 280 285 Met Cys Phe Val Asn Tyr Thr Glu Met Ala Arg Val Thr Gly Val Pro 290 295 300 Phe Asn Phe Leu Leu Ala Arg Gly Gln Gln Ile Lys Val Ile Ser Gln 305 310 315 320 Leu Phe Cys Lys Ala Leu Gln His Asp Leu Val Val Pro Asn Ile Arg 325 330 335 Val Asn Gly Thr Asp Glu Gln Tyr Glu Gly Ala Thr Val Ile Glu Pro 340 345 350 Ile Lys Gly Tyr Tyr Asp Thr Pro Ile Ala Thr Leu Asp Phe Ser Ser 355 360 365 Leu Tyr Pro Ser Ile Met Gln Ala His Asn Leu Cys Tyr Thr Thr Leu 370 375 380 Leu Asp Ser Asn Thr Ala Glu Leu Leu Lys Leu Lys Gln Asp Val Asp 385 390 395 400 Tyr Ser Val Thr Pro Asn Gly Asp Tyr Phe Val Lys Pro His Val Arg 405 410 415 Lys Gly Leu Leu Pro Ile Ile Leu Ala Asp Leu Leu Asn Ala Arg Lys 420 425 430 Lys Ala Lys Ala Asp Leu Lys Lys Glu Thr Asp Pro Phe Lys Lys Ala 435 440 445 Val Leu Asp Gly Arg Gln Leu Ala Leu Lys Val Ser Ala Asn Ser Val 450 455 460 Tyr Gly Phe Thr Gly Ala Thr Asn Gly Arg Leu Pro Cys Leu Ala Ile 465 470 475 480 Ser Ser Ser Val Thr Ser
Tyr Gly Arg Gln Met Ile Glu Lys Thr Lys 485 490 495 Asp Val Val Glu Lys Arg Tyr Arg Ile Glu Asn Gly Tyr Ser His Asp 500 505 510 Ala Val Val Ile Tyr Gly Asp Thr Asp Ser Val Met Val Lys Phe Gly 515 520 525 Val Lys Thr Leu Pro Glu Ala Met Lys Leu Gly Glu Glu Ala Ala Asn 530 535 540 Tyr Val Ser Asp Gln Phe Pro Asn Pro Ile Asn Trp Ser Phe Ser Thr 545 550 555 560 Phe Pro Tyr Leu Leu Ile Ser Lys Lys Arg Tyr Ala Gly Leu Phe Trp 565 570 575 Thr Arg Thr Asp Thr Tyr Asp Lys Met Asp Ser Lys Gly Ile Glu Thr 580 585 590 Val Arg Arg Asp Asn Cys Pro Leu Val Ser Tyr Val Ile Asp Thr Ala 595 600 605 Leu Arg Lys Met Leu Ile Asp Gln Asp Val Glu Gly Ala Gln Leu Phe 610 615 620 Thr Lys Lys Val Ile Ser Asp Leu Leu Gln Asn Lys Ile Asp Met Ser 625 630 635 640 Gln His Val Ile Thr Lys Ala Leu Ser Lys Thr Asp Tyr Ala Ala Lys 645 650 655 Met Ala His Val Glu Leu Ala Glu Arg Met Arg Lys Arg Asp Ala Gly 660 665 670 Ser Ala Pro Ala Ile Gly Asp Arg Val Ala Tyr Val Ile Ile Lys Gly 675 680 685 Ala Gln Gly Asp Gln Phe Tyr Met Arg Ser Glu Asp Pro Ile Tyr Val 690 695 700 Leu Glu Asn Asn Ile Pro Ile Asp Ala Lys Tyr Tyr Leu Glu Asn Gln 705 710 715 720 Leu Ser Lys Pro Leu Leu Arg Ile Phe Glu Pro Ile Leu Gly Glu 725 730 735 11 741 PRT Plasmodium falciparum 11 Ile Gly Gly Ile Val Tyr Glu Ala Asn Leu Pro Phe Ile Leu Arg Tyr 1 5 10 15 Ile Ile Asp His Lys Ile Thr Gly Ser Ser Trp Ile Asn Cys Lys Lys 20 25 30 Gly His Tyr Tyr Ile Arg Asn Lys Asn Lys Lys Ile Ser Asn Cys Thr 35 40 45 Phe Glu Ile Asp Ile Ser Tyr Glu His Val Glu Pro Ile Thr Leu Glu 50 55 60 Asn Glu Tyr Gln Gln Ile Pro Lys Leu Arg Ile Leu Ser Phe Asp Ile 65 70 75 80 Glu Cys Ile Lys Leu Asp Gly Lys Gly Phe Pro Glu Ala Lys Asn Asp 85 90 95 Pro Ile Ile Gln Ile Ser Ser Ile Leu Tyr Phe Gln Gly Glu Pro Ile 100 105 110 Asp Asn Cys Thr Lys Phe Ile Phe Thr Leu Leu Glu Cys Ala Ser Ile 115 120 125 Pro Gly Ser Asn Val Ile Trp Phe Asn Asp Glu Lys Thr Leu Leu Glu 130 135 140 Ala Trp Asn Glu Phe Ile Ile Arg Ile Asp Pro Asp Phe Leu Thr Gly 145 150 155 160 Tyr Asn Ile Ile Asn Phe Asp Leu Pro Tyr
Ile Leu Asn Arg Gly Thr 165 170 175 Ala Leu Asn Leu Lys Lys Leu Lys Phe Leu Gly Arg Ile Lys Asn Val 180 185 190 Ala Ser Thr Val Lys Asp Ser Ser Phe Ser Ser Lys Gln Phe Gly Thr 195 200 205 His Glu Thr Lys Glu Ile Asn Ile Phe Gly Arg Ile Gln Phe Asp Val 210 215 220 Tyr Asp Leu Ile Lys Arg Asp Tyr Lys Leu Lys Ser Tyr Thr Leu Asn 225 230 235 240 Tyr Val Ser Phe Glu Phe Leu Lys Glu Gln Lys Glu Asp Val His Tyr 245 250 255 Ser Ile Met Asn Asp Leu Gln Asn Glu Ser Pro Glu Ser Arg Lys Arg 260 265 270 Ile Ala Thr Tyr Cys Ile Lys Asp Gly Val Leu Pro Leu Arg Leu Ile 275 280 285 Asp Lys Leu Leu Phe Ile Tyr Asn Tyr Val Glu Met Ala Arg Val Thr 290 295 300 Gly Thr Pro Phe Val Tyr Leu Leu Thr Arg Gly Gln Gln Ile Lys Val 305 310 315 320 Thr Ser Gln Leu Tyr Arg Lys Cys Lys Glu Leu Asn Tyr Val Ile Pro 325 330 335 Ser Thr Tyr Met Lys Val Asn Thr Asn Glu Lys Tyr Glu Gly Ala Thr 340 345 350 Val Leu Glu Pro Ile Lys Gly Tyr Tyr Ile Glu Pro Ile Ser Thr Leu 355 360 365 Asp Phe Ala Ser Leu Tyr Pro Ser Ile Met Ile Ala His Asn Leu Cys 370 375 380 Tyr Ser Thr Leu Ile Lys Ser Asn His Glu Val Ser Asp Leu Gln Asn 385 390 395 400 Asp Asp Ile Thr Thr Ile Gln Gly Lys Asn Asn Leu Lys Phe Val Lys 405 410 415 Lys Asn Val Lys Lys Gly Ile Leu Pro Leu Ile Val Glu Glu Leu Ile 420 425 430 Glu Ala Arg Lys Lys Val Lys Leu Leu Ile Lys Asn Glu Lys Asn Asn 435 440 445 Ile Thr Lys Met Val Leu Asn Gly Arg Gln Leu Ala Leu Lys Ile Ser 450 455 460 Ala Asn Ser Val Tyr Gly Tyr Thr Gly Ala Ser Ser Gly Gly Gln Leu 465 470 475 480 Pro Cys Leu Glu Val Ala Val Ser Ile Thr Thr Leu Gly Arg Ser Met 485 490 495 Ile Glu Lys Thr Lys Glu Arg Val Glu Ser Phe Tyr Cys Lys Ser Asn 500 505 510 Gly Tyr Glu His Asn Ser Thr Val Ile Tyr Gly Asp Thr Asp Ser Val 515 520 525 Met Val Lys Phe Gly Thr Asn Asn Ile Glu Glu Ala Met Thr Leu Gly 530 535 540 Lys Asp Ala Ala Glu Arg Ile Ser Lys Glu Phe Leu Ser Pro Ile Lys 545 550 555 560 Leu Glu Phe Glu Lys Val Tyr Cys Pro Tyr Leu Leu Leu Asn Lys Lys 565 570 575 Arg Tyr Ala Gly Leu Leu Tyr Thr Asn Pro Asn Lys His Asp Lys Met 580
585 590 Asp Cys Lys Gly Ile Glu Thr Val Arg Arg Asp Phe Cys Ile Leu Ile 595 600 605 Gln Gln Met Met Glu Thr Val Leu Asn Lys Leu Leu Ile Glu Lys Asn 610 615 620 Leu Asn Ser Ala Ile Glu Tyr Thr Lys Ser Lys Ile Lys Glu Leu Leu 625 630 635 640 Thr Asn Asn Ile Asp Met Ser Leu Leu Val Val Thr Lys Ser Leu Gly 645 650 655 Lys Thr Asp Tyr Glu Thr Arg Leu Pro His Val Glu Leu Ala Lys Lys 660 665 670 Leu Lys Gln Arg Asp Ser Ala Thr Ala Pro Asn Val Gly Asp Arg Val 675 680 685 Ser Tyr Ile Ile Val Lys Gly Val Lys Gly Gln Ala Gln Tyr Glu Arg 690 695 700 Ala Glu Asp Pro Leu Tyr Val Leu Asp Asn Asn Leu Ala Ile Asp Tyr 705 710 715 720 Asn His Tyr Leu Asp Ala Ile Lys Ser Pro Leu Ser Arg Ile Phe Glu 725 730 735 Val Ile Met Gln Asn 740 12 744 PRT Chlorella virus NY-2A 12 Glu Tyr Gln Ile Tyr Glu Ser Ser Val Asp Pro Ile Ile Arg Ile Phe 1 5 10 15 His Leu Arg Asn Ile Asn Pro Ala Asp Trp Met His Val Ser Lys Ala 20 25 30 Phe Pro Val Glu Thr Arg Ile Ser Asn Ser Asp Ile Glu Val Glu Thr 35 40 45 Ser Phe Gln His Leu Gly Pro Ser Asp Leu Lys Glu Val Pro Pro Leu 50 55 60 Ile Ile Ala Ser Trp Asp Ile Glu Thr Tyr Ser Lys Asp Arg Lys Phe 65 70 75 80 Pro Leu Ala Glu Asn Pro Ala Asp Tyr Cys Ile Gln Ile Ala Thr Thr 85 90 95 Phe Gln Lys Tyr Gly Glu Pro Glu Pro Tyr Arg Arg Val Val Val Cys 100 105 110 Tyr Lys Gln Thr Ala Ser Val Glu Gly Val Glu Ile Ile Ser Cys Ala 115 120 125 Glu Glu Ala Asp Val Met Asn Thr Trp Met Thr Ile Leu Gln Asp Glu 130 135 140 Ile Thr Asp Val Ser Ile Gly Tyr Asn Leu Trp Gln Tyr Asp Leu Arg 145 150 155 160 Tyr Ile His Gly Arg Ser Met Met Cys Val Asp Asp Ile Thr Gly Glu 165 170 175 Asp Asn Val Arg Leu Lys Asn Leu Gly Arg Leu Leu Val Gly Gly Gly 180 185 190 Glu Val Ile Glu Arg Asp Leu Ser Ser Asn Ala Phe Gly Gln Asn Lys 195 200 205 Phe Phe Leu Leu Asp Met Pro Gly Val Met Gln Ile Asp Leu Leu Gln 210 215 220 Trp Phe Arg Lys Asn Arg Asn Leu Glu Ser Tyr Ser Leu Asn Asn Val 225 230 235 240 Ser Lys Leu Tyr Leu Gly Asp Gln Lys Asn Asp Leu Pro Ala Met Gln 245 250 255 Ile Phe Glu Lys Phe Glu Gly Gly Ala Asp Asp Arg Ala Ile
Ile Ala 260 265 270 Ala Tyr Ala Arg Lys Asp Thr Asp Leu Pro Leu Lys Leu Leu Lys Lys 275 280 285 Met Ala Ile Leu Glu Asp Ile Thr Glu Met Ala Asn Ala Val Lys Val 290 295 300 Pro Val Asp Tyr Ile Asn Phe Arg Gly Gln Gln Val Arg Ala Phe Ser 305 310 315 320 Cys Leu Val Gly Lys Ala Arg Gln Met Asn Tyr Ala Ile Pro Asp Asp 325 330 335 Lys Met Trp Thr Val Asp Gly Lys Tyr Glu Gly Ala Thr Val Leu Asp 340 345 350 Ala Lys Lys Gly Ala Tyr Phe Thr Ser Ile Ala Ala Leu Asp Phe Ala 355 360 365 Ser Leu Tyr Pro Ser Ile Ile Arg Ala His Asn Met Ser Pro Glu Thr 370 375 380 Leu Val Met Asp Lys Arg Phe Glu Asn Leu Pro Gly Ile Glu Tyr Tyr 385 390 395 400 Glu Ile Glu Thr Gly Leu Gly Thr Phe Lys Tyr Pro Gln Lys Asn Asp 405 410 415 Glu Thr Gly Glu Gly Gln Gly Val Val Pro Ala Leu Leu Asp Asp Leu 420 425 430 Ala Lys Phe Arg Lys Gln Ala Lys Lys His Met Ala Glu Ala Lys Lys 435 440 445 Asn Asp Asp Glu Phe Arg Glu Ala Leu Tyr Asp Ala Gln Gln Arg Ser 450 455 460 Tyr Lys Ile Val Met Asn Ser Val Tyr Gly Phe Leu Gly Ala Ser Arg 465 470 475 480 Gly Phe Ile Pro Cys Val Pro Ile Ala Ala Ser Val Thr Ala Thr Gly 485 490 495 Arg Lys Met Ile Glu His Thr Ala Lys Arg Val Thr Glu Leu Leu Pro 500 505 510 Gly Ser Glu Val Ile Tyr Gly Asp Thr Asp Ser Val Met Ile Arg Met 515 520 525 Lys Leu Pro Asp Asp Lys Ile His Asp Met Asp Glu Gln Phe Lys Met 530 535 540 Ala Lys Trp Leu Ala Gly Glu Ile Thr Lys Asp Phe Lys Ala Pro Asn 545 550 555 560 Asp Leu Glu Phe Glu Lys Ile Tyr Tyr Pro Tyr Ile Leu Tyr Ser Lys 565 570 575 Lys Arg Tyr Ala Ala Ile Lys Phe Glu Asp Pro Asp Glu Lys Gly Lys 580 585 590 Val Asp Val Lys Gly Leu Ala Leu Val Arg Arg Asp Phe Ser Pro Ile 595 600 605 Thr Arg Glu Ile Leu Lys Glu Ser Leu Asp Thr Ile Leu Phe Lys Lys 610 615 620 Asp Thr Pro Thr Ala Val Thr Glu Thr Val Glu Cys Ile Arg Lys Val 625 630 635 640 Leu Asp Asn Glu Tyr Pro Met Glu Lys Phe Thr Met Ser Lys Thr Leu 645 650 655 Lys Thr Gly Tyr Lys Asn Glu Cys Gln Pro His Leu His Val Ser Asn 660 665 670 Lys Ile Phe Glu Arg Thr Gly Phe Pro Val Pro Ser Gly Ala Arg Val 675 680 685 Pro Phe Val
Tyr Ile Glu Asp Lys Lys Asn Leu Asp Thr Lys Gln Ser 690 695 700 Phe Arg Ala Glu Asp Pro Thr Phe Ala Gln Glu Asn Asp Leu Ile Val 705 710 715 720 Asp Arg Leu Phe Tyr Ile Glu His Gln Leu Met Lys Pro Ile Cys Ser 725 730 735 Leu Phe Glu Pro Leu Leu Asp Asp 740 13 743 PRT Paramecium bursaria chlorella virus 1 13 Tyr Gln Ile Tyr Glu Ser Ser Val Asp Pro Ile Ile Arg Val Phe His 1 5 10 15 Leu Arg Asn Ile Asn Pro Ala Asp Trp Ile Arg Val Ser Lys Ala Tyr 20 25 30 Pro Ala Gln Thr Arg Ile Ser Asn Ser Asp Ile Glu Val Glu Thr Ser 35 40 45 Phe Gln His Leu Gly Pro Val Glu Asp Lys Thr Val Pro Pro Leu Val 50 55 60 Ile Ala Ser Trp Asp Ile Glu Thr Tyr Ser Lys Asp Arg Lys Phe Pro 65 70 75 80 Leu Ala Glu Asn Pro Thr Asp Tyr Cys Ile Gln Ile Ala Thr Thr Phe 85 90 95 Gln Lys Tyr Gly Glu Pro Glu Pro Tyr Arg Arg Val Val Val Cys Tyr 100 105 110 Lys Gln Thr Ala Pro Val Glu Gly Val Glu Ile Ile Ser Cys Leu Glu 115 120 125 Glu Ser Asp Val Met Asn Thr Trp Met Lys Ile Leu Gln Asp Glu Lys 130 135 140 Thr Asp Val Ser Ile Gly Tyr Asn Thr Trp Gln Tyr Asp Leu Arg Tyr 145 150 155 160 Val His Gly Arg Thr Gln Met Cys Val Asp Asp Met Thr Gly Glu Asp 165 170 175 Lys Val Lys Leu Ser Asn Leu Gly Arg Leu Leu Ser Gly Gly Gly Glu 180 185 190 Val Val Glu Arg Asp Leu Ser Ser Asn Ala Phe Gly Gln Asn Lys Phe 195 200 205 Phe Leu Leu Asp Met Pro Gly Val Met Gln Ile Asp Leu Leu Gln Trp 210 215 220 Phe Arg Lys Asn Arg Asn Leu Glu Ser Tyr Ser Leu Asn Asn Val Ser 225 230 235 240 Lys Leu Tyr Leu Gly Asp Gln Lys Asn Asp Leu Pro Ala Met Gln Ile 245 250 255 Phe Glu Lys Phe Glu Gly Asn Ala Glu Asp Arg Ala Ile Ile Ala Ala 260 265 270 Tyr Ala Ala Lys Asp Thr Asp Leu Pro Leu Lys Leu Leu Lys Lys Met 275 280 285 Ala Ile Leu Glu Asp Leu Thr Glu Met Ala Asn Ala Val Lys Val Pro 290 295 300 Val Asp Tyr Ile Asn Phe Arg Gly Gln Gln Ile Arg Ala Phe Ser Cys 305 310 315 320 Leu Val Gly Lys Ala Arg Gln Met Asn Tyr Ala Ile Pro Asp Asp Lys 325 330 335 Ala Trp Ala Thr Glu Gly Lys Tyr Glu Gly Ala Thr Val Leu Asp Ala 340 345 350 Lys Lys Gly Ala Tyr Phe Thr Pro Ile Ala Ala Leu
Asp Phe Ala Ser 355 360 365 Leu Tyr Pro Ser Ile Ile Arg Ala His Asn Met Ser Pro Glu Thr Leu 370 375 380 Val Met Glu Lys Arg Phe Glu Asn Val Pro Gly Val Glu Tyr Tyr Glu 385 390 395 400 Ile Glu Thr Gly Leu Gly Lys Phe Lys Tyr Ala Gln Lys Asn Asp Glu 405 410 415 Thr Gly Glu Gly Gln Gly Val Val Pro Ala Leu Leu Asp Asp Leu Ala 420 425 430 Lys Phe Arg Lys Leu Ala Lys Lys His Met Ala Glu Ala Lys Arg Asn 435 440 445 Gly Asp Asp Phe Lys Glu Ala Leu Tyr Asp Ala Gln Gln Arg Ser Phe 450 455 460 Lys Val Val Met Asn Ser Val Tyr Gly Phe Leu Gly Ala Ser Lys Gly 465 470 475 480 Phe Ile Pro Cys Val Pro Ile Ala Ala Ser Val Thr Ala Thr Gly Arg 485 490 495 Lys Met Ile Glu His Thr Ala Lys Arg Ala Val Glu Leu Leu Pro Gly 500 505 510 Ser Glu Val Ile Tyr Gly Asp Thr Asp Ser Val Met Val Lys Met Lys 515 520 525 Leu Pro Asp Asp Lys Val His Asp Met Asp Glu Gln Phe Lys Met Ala 530 535 540 Lys Trp Leu Ala Gly Glu Ile Thr Lys Asp Phe Arg Ala Pro Asn Asp 545 550 555 560 Leu Glu Phe Glu Lys Ile Tyr Tyr Pro Tyr Ile Leu Tyr Ser Lys Lys 565 570 575 Arg Tyr Ala Ala Val Lys Phe Glu Glu Pro Asp Glu Lys Gly Lys Val 580 585 590 Asp Val Lys Gly Leu Ala Leu Val Arg Arg Asp Phe Ser Pro Ile Thr 595 600 605 Arg Asp Ile Leu Lys Glu Ser Leu Asp Thr Ile Leu Tyr Lys Lys Asp 610 615 620 Thr Pro Thr Ala Val Ser Glu Thr Leu Glu Arg Ile Arg Lys Val Leu 625 630 635 640 Asp Asn Glu Tyr Pro Met Glu Lys Phe Met Met Ser Lys Leu Leu Lys 645 650 655 Thr Gly Tyr Lys Asn Glu Cys Gln Pro His Leu His Val Ala Asn Lys 660 665 670 Ile Tyr Glu Arg Thr Gly Phe Pro Val Pro Ser Gly Ala Arg Val Pro 675 680 685 Phe Val Tyr Ile Glu Asp Lys Lys Asn Pro Asp Ile Lys Gln Ser Phe 690 695 700 Lys Ala Glu Asp Pro Thr Phe Ala Gln Asp Asn Gly Leu Ile Val Asp 705 710 715 720 Arg Leu Phe Tyr Ile Glu His Gln Leu Leu Lys Pro Ile Cys Ser Leu 725 730 735 Phe Glu Pro Leu Leu Asp Asp 740 14 773 PRT Epstein-Barr virus (strain B95-8) 14 Gly Cys Arg Ile Phe Glu Ala Asn Val Asp Ala Thr Arg Arg Phe Val 1 5 10 15 Leu Asp Asn Asp Phe Val Thr Phe Gly Trp Tyr Ser Cys Arg Arg Ala 20 25 30 Ile
Pro Arg Leu Gln His Arg AspSer Tyr Ala Glu Leu Glu Tyr Asp 35 40 45 Cys Glu Val Gly Asp Leu Ser ValArg Arg Glu Asp Ser Ser Trp Pro 50 55 60 Ser Tyr Gln Ala Leu Ala Phe AspIle Glu Cys Leu Gly Glu Glu Gly 65 70 75 80 Phe Pro Thr Ala Thr Asn GluAla Asp Leu Ile Leu Gln Ile Ser Cys 85 90 95 Val Leu Trp Ser Thr Gly GluGlu Ala Gly Arg Tyr Arg Arg Ile Leu 100 105 110 Leu Thr Leu Gly Thr CysGlu Asp Ile Glu Gly Val Glu Val Tyr Glu 115 120 125 Phe Pro Ser Glu LeuAsp Met Leu Tyr Ala Phe Phe Gln Leu Ile Arg 130 135 140 Asp Leu Ser ValGlu Ile Val Thr Gly Tyr Asn Val Ala Asn Phe Asp 145 150 155 160 Trp ProTyr Ile Leu Asp Arg Ala Arg His Ile Tyr Ser Ile Asn Pro 165 170 175 AlaSer Leu Gly Lys Ile Arg Ala Gly Gly Val Cys Glu Val Arg Arg 180 185 190Pro His Asp Ala Gly Lys Gly Phe Leu Arg Ala Asn Thr Lys Val Arg 195 200205 Ile Thr Gly Leu Ile Pro Ile Asp Met Tyr Ala Val Cys Arg Asp Lys 210215 220 Leu Ser Leu Ser Asp Tyr Lys Leu Asp Thr Val Ala Arg His Leu Leu225 230 235 240 Gly Ala Lys Lys Glu Asp Val His Tyr Lys Glu Ile Pro ArgLeu Phe 245 250 255 Ala Ala Gly Pro Glu Gly Arg Arg Arg Leu Gly Met TyrCys Val Gln 260 265 270 Asp Ser Ala Leu Val Met Asp Leu Leu Asn His PheVal Ile His Val 275 280 285 Glu Val Ala Glu Ile Ala Lys Ile Ala His IlePro Cys Arg Arg Val 290 295 300 Leu Asp Asp Gly Gln Gln Ile Arg Val PheSer Cys Leu Leu Ala Ala 305 310 315 320 Ala Gln Lys Glu Asn Phe Ile LeuPro Met Pro Ser Ala Ser Asp Arg 325 330 335 Asp Gly Tyr Gln Gly Ala ThrVal Ile Gln Pro Leu Ser Gly Phe Tyr 340 345 350 Asn Ser Pro Val Leu ValVal Asp Phe Ala Ser Leu Tyr Pro Ser Ile 355 360 365 Ile Gln Ala His AsnLeu Cys Tyr Ser Thr Met Ile Thr Pro Gly Glu 370 375 380 Glu His Arg LeuAla Gly Leu Arg Pro Gly Glu Asp Tyr Glu Ser Phe 385 390 395 400 Arg LeuThr Gly Gly Val Tyr His Phe Val Lys Lys His Val His Glu 405 410 415 SerPhe Leu Ala Ser Leu Leu Thr Ser Trp Leu Ala Lys Arg Lys Ala 420 425 430Ile Lys Lys Leu Leu Ala Ala Cys Glu Asp Pro Arg Gln Arg Thr Ile 435 440445 Leu Asp Lys Gln Gln Leu Ala Ile Lys Cys Thr Cys 
Asn Ala Val Tyr 450 455 460 Gly Phe Thr Gly Val Ala Asn Gly Leu Phe Pro Cys Leu Ser Ile Ala 465 470 475 480 Glu Thr Val Thr Leu Gln Gly Arg Thr Met Leu Glu Arg Ala Lys Ala 485 490 495 Phe Val Glu Ala Leu Ser Pro Ala Asn Leu Gln Ala Leu Ala Pro Ser 500 505 510 Pro Asp Ala Trp Ala Pro Leu Asn Pro Glu Gly Gln Leu Arg Val Ile 515 520 525 Tyr Gly Asp Thr Asp Ser Leu Phe Ile Glu Cys Arg Gly Phe Ser Glu 530 535 540 Ser Glu Thr Leu Arg Phe Ala Asp Ala Leu Ala Ala His Thr Thr Arg 545 550 555 560 Ser Leu Phe Val Ala Pro Ile Ser Leu Glu Ala Glu Lys Thr Phe Ser 565 570 575 Cys Leu Met Leu Ile Thr Lys Lys Arg Tyr Val Gly Val Leu Thr Asp 580 585 590 Gly Lys Thr Leu Met Lys Gly Val Glu Leu Val Arg Lys Thr Ala Cys 595 600 605 Lys Phe Val Gln Thr Arg Cys Arg Arg Val Leu Asp Leu Val Leu Ala 610 615 620 Asp Ala Arg Val Lys Glu Ala Ala Ser Leu Leu Ser His Arg Pro Phe 625 630 635 640 Gln Glu Ser Phe Thr Gln Gly Leu Pro Val Gly Phe Leu Pro Val Ile 645 650 655 Asp Ile Leu Asn Gln Ala Tyr Thr Asp Leu Arg Glu Gly Arg Val Pro 660 665 670 Met Gly Glu Leu Cys Phe Ser Thr Glu Leu Ser Arg Lys Leu Ser Ala 675 680 685 Tyr Lys Ser Thr Gln Met Pro His Leu Ala Val Tyr Gln Lys Phe Val 690 695 700 Glu Arg Asn Glu Glu Leu Pro Gln Ile His Asp Arg Ile Gln Tyr Val 705 710 715 720 Phe Val Glu Pro Lys Gly Gly Val Lys Gly Ala Arg Lys Thr Glu Met 725 730 735 Ala Glu Asp Pro Ala Tyr Ala Glu Arg His Gly Val Pro Val Ala Val 740 745 750 Asp His Tyr Phe Asp Lys Leu Leu Gln Gly Ala Ala Asn Ile Leu Gln 755 760 765 Cys Leu Phe Asp Asn 770 15 764 PRT Herpesvirus saimiri (strain 11) 15 Gly Cys Glu Val Phe Glu Thr Asn Val Asp Ala Ile Arg Arg Phe Val 1 5 10 15 Ile Asp Asn Asp Phe Ser Thr Phe Gly Trp Tyr Thr Cys Lys Ser Ala 20 25 30 Cys Pro Arg Ile Thr Asn Arg Asp Ser His Thr Asp Ile Glu Phe Asp 35 40 45 Cys Gly Tyr Tyr Asp Leu Glu Phe His Ala Asp Arg Thr Glu Trp Pro 50 55 60 Pro Tyr Asn Ile Met Ser Phe Asp Ile Glu Cys Ile Gly Glu Lys Gly 65 70 75 80 Phe Pro Cys Ala Lys Asn Glu Gly Asp Leu Ile Ile Gln Ile Ser Cys 85 90 95 Val Phe Trp His Ala Gly
AlaLeu Asp Thr Thr Arg Asn Met Leu Leu 100 105 110 Ser Leu Gly Thr Cys SerAla Val Glu Asn Thr Glu Val Tyr Glu Phe 115 120 125 Pro Ser Glu Ile AspMet Leu His Gly Phe Phe Ser Leu Ile Arg Asp 130 135 140 Phe Asn Val GluIle Ile Thr Gly Tyr Asn Ile Ser Asn Phe Asp Leu 145 150 155 160 Pro TyrLeu Ile Asp Arg Ala Thr Gln Ile Tyr Asn Ile Lys Leu Ser 165 170 175 AspTyr Ser Arg Val Lys Thr Gly Ser Ile Phe Gln Val His Thr Pro 180 185 190Lys Asp Thr Gly Asn Gly Phe Met Arg Ser Val Ser Lys Ile Lys Ile 195 200205 Ser Gly Ile Ile Ala Ile Asp Met Tyr Ile Val Cys Lys Asp Lys Leu 210215 220 Ser Leu Ser Asn Tyr Lys Leu Asp Thr Val Ala Asn His Cys Ile Gly225 230 235 240 Ala Lys Lys Glu Asp Val Ser Tyr Lys Asp Ile Met Pro LeuPhe Met 245 250 255 Ser Gly Pro Glu Gly Arg Ala Lys Ile Gly Leu Tyr CysVal Ile Asp 260 265 270 Ser Val Leu Val Met Lys Leu Leu Lys Phe Phe MetIle His Val Glu 275 280 285 Ile Ser Glu Ile Ala Lys Leu Ala Lys Ile ProThr Arg Arg Val Leu 290 295 300 Thr Asp Gly Gln Gln Ile Arg Val Phe SerCys Leu Leu Ala Ala Ala 305 310 315 320 Arg Ala Glu Asn Tyr Ile Leu ProVal Ser Asn Asp Val Asn Ala Asp 325 330 335 Gly Phe Gln Gly Ala Thr ValIle Asn Pro Ile Pro Gly Phe Tyr Asn 340 345 350 Asn Ala Val Leu Val ValAsp Phe Ala Ser Leu Tyr Pro Ser Ile Ile 355 360 365 Gln Ala His Asn LeuCys Tyr Ser Thr Leu Ile Pro His His Ala Leu 370 375 380 His Asn Tyr ProHis Leu Lys Ser Ser Asp Tyr Glu Thr Phe Met Leu 385 390 395 400 Ser SerGly Pro Ile His Phe Val Lys Lys His Ile Gln Ala Ser Leu 405 410 415 LeuSer Arg Leu Leu Thr Val Trp Leu Ser Lys Arg Lys Ala Ile Arg 420 425 430Gln Lys Leu Ala Glu Cys Glu Asp Leu Asp Thr Lys Thr Ile Leu Asp 435 440445 Lys Gln Gln Leu Ala Ile Lys Val Thr Cys Asn Ala Val Tyr Gly Phe 450455 460 Thr Gly Val Ala Ser Gly Leu Leu Pro Cys Ile Ser Ile Ala Glu Thr465 470 475 480 Val Thr Leu Gln Gly Arg Thr Met Leu Glu Lys Ser Lys IlePhe Ile 485 490 495 Glu Ala Met Thr Pro Asp Thr Leu Gln Glu Ile Val ProHis Ile Val 500 505 510 Lys His Glu Pro Asp Ala Lys Phe Arg Val Ile TyrGly Asp 
Thr Asp 515 520 525 Ser Leu Phe Val Glu Cys Val Gly Tyr Ser ValAsp Thr Val Val Lys 530 535 540 Phe Gly Asp Phe Leu Ala Ala Phe Thr SerGlu Lys Leu Phe Asn Ala 545 550 555 560 Pro Ile Lys Leu Glu Ser Glu LysThr Phe Gln Cys Leu Leu Leu Leu 565 570 575 Ala Lys Lys Arg Tyr Ile GlyIle Leu Ser Asn Asp Lys Leu Leu Met 580 585 590 Lys Gly Val Asp Leu ValArg Lys Thr Ala Cys Lys Phe Val Gln Asn 595 600 605 Thr Ser Ser Lys IleLeu Asn Leu Ile Leu Lys Asp Pro Glu Val Lys 610 615 620 Ala Ala Ala GlnLeu Leu Ser Thr Lys Asp Pro Asp Tyr Ala Phe Arg 625 630 635 640 Glu GlyLeu Pro Asp Gly Phe Leu Lys Val Ile Asp Ile Leu Asn Glu 645 650 655 SerHis Lys Asn Leu Arg Thr Gly Gln Val Pro Val Glu Glu Leu Thr 660 665 670Phe Ser Thr Glu Leu Ser Arg Pro Ile Ser Ser Tyr Lys Thr Glu Asn 675 680685 Leu Pro His Leu Thr Val Tyr Lys Lys Ile Ile Thr Arg His Glu Glu 690695 700 Pro Pro Gln Val His Asp Arg Ile Pro Tyr Val Phe Val Gly Lys Thr705 710 715 720 Thr Ser Cys Ile Ser Asn Met Ala Glu Asp Pro Thr Tyr ThrVal Gln 725 730 735 Asn Asn Ile Pro Ile Ala Val Asp Leu Tyr Phe Asp LysLeu Ile His 740 745 750 Gly Val Ala Asn Ile Ile Gln Cys Leu Phe Lys Asp755 760 16 892 PRT Herpes simplex virus (type 1/strain 17) 16 Pro AlaIle Lys Lys Tyr Glu Gly Gly Val Asp Ala Thr Thr Arg Phe 1 5 10 15 IleLeu Asp Asn Pro Gly Phe Val Thr Phe Gly Trp Tyr Arg Leu Lys 20 25 30 ProGly Arg Asn Asn Thr Leu Ala Gln Pro Ala Ala Pro Met Ala Phe 35 40 45 GlyThr Ser Ser Asp Val Glu Phe Asn Cys Thr Ala Asp Asn Leu Ala 50 55 60 IleGlu Gly Gly Met Ser Asp Leu Pro Ala Tyr Lys Leu Met Cys Phe 65 70 75 80Asp Ile Glu Cys Lys Ala Gly Gly Glu Asp Glu Leu Ala Phe Pro Val 85 90 95Ala Gly His Pro Glu Asp Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr 100 105110 Asp Leu Ser Thr Thr Ala Leu Glu His Val Leu Leu Phe Ser Leu Gly 115120 125 Ser Cys Asp Leu Pro Glu Ser His Leu Asn Glu Leu Ala Ala Arg Gly130 135 140 Leu Pro Thr Pro Val Val Leu Glu Phe Asp Ser Glu Phe Glu MetLeu 145 150 155 160 Leu Ala Phe Met Thr Leu Val Lys Gln Tyr Gly Pro GluPhe Val Thr 165 170 
175 Gly Tyr Asn Ile Ile Asn Phe Asp Trp Pro Phe LeuLeu Ala Lys Leu 180 185 190 Thr Asp Ile Tyr Lys Val Pro Leu Asp Gly TyrGly Arg Met Asn Gly 195 200 205 Arg Gly Val Phe Arg Val Trp Asp Ile GlyGln Ser His Phe Gln Lys 210 215 220 Arg Ser Lys Ile Lys Val Asn Gly MetVal Asn Ile Asp Met Tyr Gly 225 230 235 240 Ile Ile Thr Asp Lys Ile LysLeu Ser Ser Tyr Lys Leu Asn Ala Val 245 250 255 Ala Glu Ala Val Leu LysAsp Lys Lys Lys Asp Leu Ser Tyr Arg Asp 260 265 270 Ile Pro Ala Tyr TyrAla Ala Gly Pro Ala Gln Arg Gly Val Ile Gly 275 280 285 Glu Tyr Cys IleGln Asp Ser Leu Leu Val Gly Gln Leu Phe Phe Lys 290 295 300 Phe Leu ProHis Leu Glu Leu Ser Ala Val Ala Arg Leu Ala Gly Ile 305 310 315 320 AsnIle Thr Arg Thr Ile Tyr Asp Gly Gln Gln Ile Arg Val Phe Thr 325 330 335Cys Leu Leu Arg Leu Ala Asp Gln Lys Gly Phe Ile Leu Pro Asp Thr 340 345350 Gln Gly Arg Phe Arg Gly Ala Gly Gly Glu Ala Pro Lys Arg Pro Ala 355360 365 Ala Ala Arg Glu Asp Glu Glu Arg Pro Glu Glu Glu Gly Glu Asp Glu370 375 380 Asp Glu Arg Glu Glu Gly Gly Gly Glu Arg Glu Pro Glu Gly AlaArg 385 390 395 400 Glu Thr Ala Gly Arg His Val Gly Tyr Gln Gly Ala ArgVal Leu Asp 405 410 415 Pro Thr Ser Gly Phe His Val Asn Pro Val Val ValPhe Asp Phe Ala 420 425 430 Ser Leu Tyr Pro Ser Ile Ile Gln Ala His AsnLeu Cys Phe Ser Thr 435 440 445 Leu Ser Leu Arg Ala Asp Ala Val Ala HisLeu Glu Ala Gly Lys Asp 450 455 460 Tyr Leu Glu Ile Glu Val Gly Gly ArgArg Leu Phe Phe Val Lys Ala 465 470 475 480 His Val Arg Glu Ser Leu LeuSer Ile Leu Leu Arg Asp Trp Leu Ala 485 490 495 Met Arg Lys Gln Ile ArgSer Arg Ile Pro Gln Ser Ser Pro Glu Glu 500 505 510 Ala Val Leu Leu AspLys Gln Gln Ala Ala Ile Lys Val Val Cys Asn 515 520 525 Ser Val Tyr GlyPhe Thr Gly Val Gln His Gly Leu Leu Pro Cys Leu 530 535 540 His Val AlaAla Thr Val Thr Thr Ile Gly Arg Glu Met Leu Leu Ala 545 550 555 560 ThrArg Glu Tyr Val His Ala Arg Trp Ala Ala Phe Glu Gln Leu Leu 565 570 575Ala Asp Phe Pro Glu Ala Ala Asp Met Arg Ala Pro Gly Pro Tyr Ser 580 585590 Met Arg Ile Ile Tyr Gly Asp 
Thr Asp Ser Ile Phe Val Leu Cys Arg 595600 605 Gly Leu Thr Ala Ala Gly Leu Thr Ala Val Gly Asp Lys Met Ala Ser610 615 620 His Ile Ser Arg Ala Leu Phe Leu Pro Pro Ile Lys Leu Glu CysGlu 625 630 635 640 Lys Thr Phe Thr Lys Leu Leu Leu Ile Ala Lys Lys LysTyr Ile Gly 645 650 655 Val Ile Tyr Gly Gly Lys Met Leu Ile Lys Gly ValAsp Leu Val Arg 660 665 670 Lys Asn Asn Cys Ala Phe Ile Asn Arg Thr SerArg Ala Leu Val Asp 675 680 685 Leu Leu Phe Tyr Asp Asp Thr Val Ser GlyAla Ala Ala Ala Leu Ala 690 695 700 Glu Arg Pro Ala Glu Glu Trp Leu AlaArg Pro Leu Pro Glu Gly Leu 705 710 715 720 Gln Ala Phe Gly Ala Val LeuVal Asp Ala His Arg Arg Ile Thr Asp 725 730 735 Pro Glu Arg Asp Ile GlnAsp Phe Val Leu Thr Ala Glu Leu Ser Arg 740 745 750 His Pro Arg Ala TyrThr Asn Lys Arg Leu Ala His Leu Thr Val Tyr 755 760 765 Tyr Lys Leu MetAla Arg Arg Ala Gln Val Pro Ser Ile Lys Asp Arg 770 775 780 Ile Pro TyrVal Ile Val Ala Gln Thr Arg Glu Val Glu Glu Thr Val 785 790 795 800 AlaArg Leu Ala Ala Leu Arg Glu Leu Asp Ala Ala Ala Pro Gly Asp 805 810 815Glu Pro Ala Pro Pro Ala Ala Leu Pro Ser Pro Ala Lys Arg Pro Arg 820 825830 Glu Thr Pro Ser Pro Ala Asp Pro Pro Gly Gly Ala Ser Lys Pro Arg 835840 845 Lys Leu Leu Val Ser Glu Leu Ala Glu Asp Pro Ala Tyr Ala Ile Ala850 855 860 His Gly Val Ala Leu Asn Thr Asp Tyr Tyr Phe Ser His Leu LeuGly 865 870 875 880 Ala Ala Cys Val Thr Phe Lys Ala Leu Phe Gly Asn 885890 17 896 PRT Herpes simplex virus (type 2/strain 186) 17 Pro Ala IleArg Lys Tyr Glu Gly Gly Val Asp Ala Thr Thr Arg Phe 1 5 10 15 Ile LeuAsp Asn Pro Gly Phe Val Thr Phe Gly Trp Tyr Arg Leu Lys 20 25 30 Pro GlyArg Gly Asn Ala Pro Ala Gln Pro Arg Pro Pro Thr Ala Phe 35 40 45 Gly ThrSer Ser Asp Val Glu Phe Asn Cys Thr Ala Asp Asn Leu Ala 50 55 60 Val GluGly Ala Met Cys Asp Leu Pro Ala Tyr Lys Leu Met Cys Phe 65 70 75 80 AspIle Glu Cys Lys Ala Gly Gly Glu Asp Glu Leu Ala Phe Pro Val 85 90 95 AlaGlu Arg Pro Glu Asp Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr 100 105 110Asp Leu Ser Thr Thr Ala Leu Glu His Ile 
Leu Leu Phe Ser Leu Gly 115 120125 Ser Cys Asp Leu Pro Glu Ser His Leu Ser Asp Leu Ala Ser Arg Gly 130135 140 Leu Pro Ala Pro Val Val Leu Glu Phe Asp Ser Glu Phe Glu Met Leu145 150 155 160 Leu Ala Phe Met Thr Phe Val Lys Gln Tyr Gly Pro Glu PheVal Thr 165 170 175 Gly Tyr Asn Ile Ile Asn Phe Asp Trp Pro Phe Val LeuThr Lys Leu 180 185 190 Thr Glu Ile Tyr Lys Val Pro Leu Asp Gly Tyr GlyArg Met Asn Gly 195 200 205 Arg Gly Val Phe Arg Val Trp Asp Ile Gly GlnSer His Phe Gln Lys 210 215 220 Arg Ser Lys Ile Lys Val Asn Gly Met ValAsn Ile Asp Met Tyr Gly 225 230 235 240 Ile Ile Thr Asp Lys Val Lys LeuSer Ser Tyr Lys Leu Asn Ala Val 245 250 255 Ala Glu Ala Val Leu Lys AspLys Lys Lys Asp Leu Ser Tyr Arg Asp 260 265 270 Ile Pro Ala Tyr Tyr AlaSer Gly Pro Ala Gln Arg Gly Val Ile Gly 275 280 285 Glu Tyr Cys Val GlnAsp Ser Leu Leu Val Gly Gln Leu Phe Phe Lys 290 295 300 Phe Leu Pro HisLeu Glu Leu Ser Ala Val Ala Arg Leu Ala Gly Ile 305 310 315 320 Asn IleThr Arg Thr Ile Tyr Asp Gly Gln Gln Ile Arg Val Phe Thr 325 330 335 CysLeu Leu Arg Leu Ala Gly Gln Lys Gly Phe Ile Leu Pro Asp Thr 340 345 350Gln Gly Arg Phe Arg Gly Leu Asp Lys Glu Ala Pro Lys Arg Pro Ala 355 360365 Val Pro Arg Gly Glu Gly Glu Arg Pro Gly Asp Gly Asn Gly Asp Glu 370375 380 Asp Lys Asp Asp Asp Glu Asp Gly Asp Glu Asp Gly Asp Glu Arg Glu385 390 395 400 Glu Val Ala Arg Glu Thr Gly Gly Arg His Val Gly Tyr GlnGly Ala 405 410 415 Arg Val Leu Asp Pro Thr Ser Gly Phe His Val Asp ProVal Val Val 420 425 430 Phe Asp Phe Ala Ser Leu Tyr Pro Ser Ile Ile GlnAla His Asn Leu 435 440 445 Cys Phe Ser Thr Leu Ser Leu Arg Pro Glu AlaVal Ala His Leu Glu 450 455 460 Ala Asp Arg Asp Tyr Leu Glu Ile Glu ValGly Gly Arg Arg Leu Phe 465 470 475 480 Phe Val Lys Ala His Val Arg GluSer Leu Leu Ser Ile Leu Leu Arg 485 490 495 Asp Trp Leu Ala Met Arg LysGln Ile Arg Ser Arg Ile Pro Gln Ser 500 505 510 Pro Pro Glu Glu Ala ValLeu Leu Asp Lys Gln Gln Ala Ala Ile Lys 515 520 525 Val Val Cys Asn SerVal Tyr Gly Phe Thr Gly Val Gln His Gly Leu 530 535 
540 Leu Pro Cys LeuHis Val Ala Ala Thr Val Thr Thr Ile Gly Arg Glu 545 550 555 560 Met LeuLeu Ala Thr Arg Ala Tyr Val His Ala Arg Trp Ala Glu Phe 565 570 575 AspGln Leu Leu Ala Asp Phe Pro Glu Ala Ala Gly Met Arg Ala Pro 580 585 590Gly Pro Tyr Ser Met Arg Ile Ile Tyr Gly Asp Thr Asp Ser Ile Phe 595 600605 Val Leu Cys Arg Gly Leu Thr Gly Glu Ala Leu Val Ala Met Gly Asp 610615 620 Lys Met Ala Ser His Ile Ser Arg Ala Leu Phe Leu Pro Pro Ile Lys625 630 635 640 Leu Glu Cys Glu Lys Thr Phe Thr Lys Leu Leu Leu Ile AlaLys Lys 645 650 655 Lys Tyr Ile Gly Val Ile Cys Gly Gly Lys Met Leu IleLys Gly Val 660 665 670 Asp Leu Val Arg Lys Asn Asn Cys Ala Phe Ile AsnArg Thr Ser Arg 675 680 685 Ala Leu Val Asp Leu Leu Phe Tyr Asp Asp ThrVal Ser Gly Ala Ala 690 695 700 Ala Ala Leu Ala Glu Arg Pro Ala Glu GluTrp Leu Ala Arg Pro Leu 705 710 715 720 Pro Glu Gly Leu Gln Ala Phe GlyAla Val Leu Val Asp Ala His Arg 725 730 735 Arg Ile Thr Asp Pro Glu ArgAsp Ile Gln Asp Phe Val Leu Thr Ala 740 745 750 Glu Leu Ser Arg His ProArg Ala Tyr Thr Asn Lys Arg Leu Ala His 755 760 765 Leu Thr Val Tyr TyrLys Leu Met Ala Arg Arg Ala Gln Val Pro Ser 770 775 780 Ile Lys Asp ArgIle Pro Tyr Val Ile Val Ala Gln Thr Arg Glu Val 785 790 795 800 Glu GluThr Val Ala Arg Leu Ala Ala Leu Arg Glu Leu Asp Ala Ala 805 810 815 AlaPro Gly Asp Glu Pro Ala Pro Pro Ala Ala Leu Pro Ser Pro Ala 820 825 830Lys Arg Pro Arg Glu Thr Pro Ser His Ala Asp Pro Pro Gly Gly Ala 835 840845 Ser Lys Pro Arg Lys Leu Leu Val Ser Glu Leu Ala Glu Asp Pro Gly 850855 860 Tyr Ala Ile Ala Arg Gly Val Pro Leu Asn Thr Asp Tyr Tyr Phe Ser865 870 875 880 His Leu Leu Gly Ala Ala Cys Val Thr Phe Lys Ala Leu PheGly Asn 885 890 895 18 875 PRT Equine herpesvirus type 1 (strain Ab4p)18 Pro Glu Ile Thr Lys Phe Glu Gly Ser Val Asp Val Thr Thr Arg Leu 1 510 15 Leu Leu Asp Asn Glu Asn Phe Thr Ser Phe Gly Trp Tyr Arg Leu Arg 2025 30 Pro Gly Thr His Gly Glu Arg Val Gln Leu Arg Pro Val Glu Arg His 3540 45 Val Thr Ser Ser Asp Val Glu Ile Asn Cys Thr Pro Asp Asn 
Leu Glu 5055 60 Pro Ile Pro Asp Glu Ala Ala Trp Pro Asp Tyr Lys Leu Met Cys Phe 6570 75 80 Asp Ile Glu Cys Lys Ala Gly Thr Gly Asn Glu Met Ala Phe Pro Val85 90 95 Ala Thr Asn Gln Glu Asp Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr100 105 110 Ser Leu Ala Thr Gln Asn His Glu His Thr Leu Leu Phe Ser LeuGly 115 120 125 Ser Cys Asp Ile Ser Glu Glu Tyr Ser Phe Ala Cys Val GlnArg Gly 130 135 140 Glu Pro Arg Pro Thr Val Leu Glu Phe Asp Ser Glu TyrGlu Leu Leu 145 150 155 160 Val Ala Phe Leu Thr Phe Leu Lys Gln Tyr SerPro Glu Phe Ala Thr 165 170 175 Gly Tyr Asn Ile Val Asn Phe Asp Trp AlaTyr Ile Val Asn Lys Val 180 185 190 Thr Ser Val Tyr Asn Ile Lys Leu AspGly Tyr Gly Lys Phe Asn Lys 195 200 205 Gly Gly Leu Phe Lys Val Trp AspIle Ala Thr Asn His Phe Gln Lys 210 215 220 Lys Ser Lys Val Lys Ile AsnGly Leu Ile Ser Leu Asp Met Tyr Ser 225 230 235 240 Val Ala Thr Glu LysLeu Lys Leu Pro Ser Tyr Lys Leu Asp Ala Val 245 250 255 Val Gly Asp ValLeu Gly Glu His Lys Ile Asp Leu Pro Tyr Lys Glu 260 265 270 Ile Pro SerTyr Tyr Ala Gly Gly Pro Asp Arg Arg Gly Val Ile Gly 275 280 285 Glu TyrCys Ile Gln Asp Ser Arg Leu Val Gly Lys Leu Phe Phe Lys 290 295 300 TyrLeu Pro His Leu Glu Leu Ser Ala Val Ala Lys Leu Ala Arg Ile 305 310 315320 Thr Leu Thr Arg Val Ile Phe Asp Gly Gln Gln Ile Arg Val Tyr Thr 325330 335 Cys Leu Leu Lys Leu Ala Arg Glu Arg Asn Phe Ile Leu Pro Asp Asn340 345 350 Arg Arg Arg Phe Asp Ser Gln Ala Asp Ala Ala Ser Glu Thr SerGlu 355 360 365 Leu Ala Met Asp Ser Gln Ser His Ala Phe Asp Ser Thr AspGlu Pro 370 375 380 Asp Gly Val Asp Gly Thr Pro Asp Ala Ala Gly Ser GlyAla Thr Ser 385 390 395 400 Glu Asn Gly Gly Gly Lys Pro Gly Val Gly ArgAla Val Gly Tyr Gln 405 410 415 Gly Ala Lys Val Leu Asp Pro Val Ser GlyPhe His Val Asp Pro Val 420 425 430 Val Val Phe Asp Phe Ala Ser Leu TyrPro Ser Ile Ile Gln Ala His 435 440 445 Asn Leu Cys Phe Thr Thr Leu AlaLeu Asp Glu Val Asp Leu Ala Gly 450 455 460 Leu Gln Pro Ser Val Asp TyrSer Thr Phe Glu Val Gly Asp Gln Lys 465 470 475 480 Leu Phe Phe Val 
HisAla His Ile Arg Glu Ser Leu Leu Gly Ile Leu 485 490 495 Leu Arg Asp TrpLeu Ala Met Arg Lys Ala Val Arg Ala Arg Ile Pro 500 505 510 Thr Ser ThrPro Glu Glu Ala Val Leu Leu Asp Lys Gln Gln Ser Ala 515 520 525 Ile LysVal Ile Cys Asn Ser Val Tyr Gly Phe Thr Gly Val Ala Asn 530 535 540 GlyLeu Leu Pro Cys Leu Arg Ile Ala Ala Thr Val Thr Thr Ile Gly 545 550 555560 Arg Asp Met Leu Leu Lys Thr Arg Asp Tyr Val His Ser Arg Trp Ala 565570 575 Thr Arg Glu Leu Leu Glu Asp Asn Phe Pro Gly Ala Ile Gly Phe Arg580 585 590 Asn His Lys Pro Tyr Ser Val Arg Val Ile Tyr Gly Asp Thr AspSer 595 600 605 Val Phe Ile Lys Phe Val Gly Leu Thr Tyr Glu Gly Val SerGlu Leu 610 615 620 Gly Asp Ala Met Ser Arg Gln Ile Ser Ala Asp Leu PheArg Ala Pro 625 630 635 640 Ile Lys Leu Glu Cys Glu Lys Thr Phe Gln ArgLeu Leu Leu Ile Thr 645 650 655 Lys Lys Lys Tyr Ile Gly Val Ile Asn GlyGly Lys Met Leu Met Lys 660 665 670 Gly Val Asp Leu Val Arg Lys Asn AsnCys Ser Phe Ile Asn Leu Tyr 675 680 685 Ala Arg His Leu Val Asp Leu LeuLeu Tyr Asp Glu Asp Val Ala Thr 690 695 700 Ala Ala Ala Glu Val Thr AspVal Pro Pro Ala Glu Trp Val Gly Arg 705 710 715 720 Pro Leu Pro Ser GlyPhe Asp Lys Phe Gly Arg Val Leu Val Glu Ala 725 730 735 Tyr Asn Arg IleThr Ala Pro Asn Leu Asp Val Arg Glu Phe Val Met 740 745 750 Thr Ala GluLeu Ser Arg Ser Pro Glu Ser Tyr Thr Asn Lys Arg Leu 755 760 765 Pro HisLeu Thr Val Tyr Phe Lys Leu Ala Met Arg Asn Glu Glu Leu 770 775 780 ProSer Val Lys Glu Arg Ile Pro Tyr Val Ile Val Ala Gln Thr Glu 785 790 795800 Ala Ala Glu Arg Glu Ala Gly Val Val Asn Ser Met Arg Gly Thr Ala 805810 815 Gln Asn Pro Val Val Thr Lys Thr Ala Arg Pro Gln Pro Lys Arg Lys820 825 830 Leu Leu Val Ser Asp Leu Ala Glu Asp Pro Thr Tyr Val Ser GluAsn 835 840 845 Asp Val Pro Leu Asn Thr Asp Tyr Tyr Phe Ser His Leu LeuGly Thr 850 855 860 Ile Ser Val Thr Phe Lys Ala Leu Phe Gly Asn 865 870875 19 852 PRT Varicella-zoster virus (strain Dumas) 19 Pro Glu Leu LysLys Tyr Glu Gly Arg Val Asp Ala Thr Thr Arg Phe 1 5 10 15 Leu Met AspAsn Pro 
Gly Phe Val Ser Phe Gly Trp Tyr Gln Leu Lys 20 25 30 Pro Gly ValAsp Gly Glu Arg Val Arg Val Arg Pro Ala Ser Arg Gln 35 40 45 Leu Thr LeuSer Asp Val Glu Ile Asp Cys Met Ser Asp Asn Leu Gln 50 55 60 Ala Ile ProAsn Asp Asp Ser Trp Pro Asp Tyr Lys Leu Leu Cys Phe 65 70 75 80 Asp IleGlu Cys Lys Ser Gly Gly Ser Asn Glu Leu Ala Phe Pro Asp 85 90 95 Ala ThrHis Leu Glu Asp Leu Val Ile Gln Ile Ser Cys Leu Leu Tyr 100 105 110 SerIle Pro Arg Gln Ser Leu Glu His Ile Leu Leu Phe Ser Leu Gly 115 120 125Ser Cys Asp Leu Pro Gln Arg Tyr Val Gln Glu Met Lys Asp Ala Gly 130 135140 Leu Pro Glu Pro Thr Val Leu Glu Phe Asp Ser Glu Phe Glu Leu Leu 145150 155 160 Ile Ala Phe Met Thr Leu Val Lys Gln Tyr Ala Pro Glu Phe AlaThr 165 170 175 Gly Tyr Asn Ile Val Asn Phe Asp Trp Ala Phe Ile Met GluLys Leu 180 185 190 Asn Ser Ile Tyr Ser Leu Lys Leu Asp Gly Tyr Gly SerIle Asn Arg 195 200 205 Gly Gly Leu Phe Lys Ile Trp Asp Val Gly Lys SerGly Phe Gln Arg 210 215 220 Arg Ser Lys Val Lys Ile Asn Gly Leu Ile SerLeu Asp Met Tyr Ala 225 230 235 240 Ile Ala Thr Glu Lys Leu Lys Leu SerSer Tyr Lys Leu Asp Ser Val 245 250 255 Ala Arg Glu Ala Leu Asn Glu SerLys Arg Asp Leu Pro Tyr Lys Asp 260 265 270 Ile Pro Gly Tyr Tyr Ala SerGly Pro Asn Thr Arg Gly Ile Ile Gly 275 280 285 Glu Tyr Cys Ile Gln AspSer Ala Leu Val Gly Lys Leu Phe Phe Lys 290 295 300 Tyr Leu Pro His LeuGlu Leu Ser Ala Val Ala Arg Leu Ala Arg Ile 305 310 315 320 Thr Leu ThrLys Ala Ile Tyr Asp Gly Gln Gln Val Arg Ile Tyr Thr 325 330 335 Cys LeuLeu Gly Leu Ala Ser Ser Arg Gly Phe Ile Leu Pro Asp Gly 340 345 350 GlyTyr Pro Ala Thr Phe Glu Tyr Lys Asp Val Ile Pro Asp Val Gly 355 360 365Asp Val Glu Glu Glu Met Asp Glu Asp Glu Ser Val Ser Pro Thr Gly 370 375380 Thr Ser Ser Gly Arg Asn Val Gly Tyr Lys Gly Ala Arg Val Phe Asp 385390 395 400 Pro Asp Thr Gly Phe Tyr Ile Asp Pro Val Val Val Leu Asp PheAla 405 410 415 Ser Leu Tyr Pro Ser Ile Ile Gln Ala His Asn Leu Cys PheThr Thr 420 425 430 Leu Thr Leu Asn Phe Glu Thr Val Lys Arg Leu Asn ProSer Asp Tyr 435 
440 445 Ala Thr Phe Thr Val Gly Gly Lys Arg Leu Phe PheVal Arg Ser Asn 450 455 460 Val Arg Glu Ser Leu Leu Gly Val Leu Leu LysAsp Trp Leu Ala Met 465 470 475 480 Arg Lys Ala Ile Arg Ala Arg Ile ProGly Ser Ser Ser Asp Glu Ala 485 490 495 Val Leu Leu Asp Lys Gln Gln AlaAla Ile Lys Val Val Cys Asn Ser 500 505 510 Val Tyr Gly Phe Thr Gly ValAla Gln Gly Phe Leu Pro Cys Leu Tyr 515 520 525 Val Ala Ala Thr Val ThrThr Ile Gly Arg Gln Met Leu Leu Ser Thr 530 535 540 Arg Asp Tyr Ile HisAsn Asn Trp Ala Ala Phe Glu Arg Phe Ile Thr 545 550 555 560 Ala Phe ProAsp Ile Glu Ser Ser Val Leu Ser Gln Lys Ala Tyr Glu 565 570 575 Val LysVal Ile Tyr Gly Asp Thr Asp Ser Val Phe Ile Arg Phe Lys 580 585 590 GlyVal Ser Val Glu Gly Ile Ala Lys Ile Gly Glu Lys Met Ala His 595 600 605Ile Ile Ser Thr Ala Leu Phe Cys Pro Pro Ile Lys Leu Glu Cys Glu 610 615620 Lys Thr Phe Ile Lys Leu Leu Leu Ile Thr Lys Lys Lys Tyr Ile Gly 625630 635 640 Val Ile Tyr Gly Gly Lys Val Leu Met Lys Gly Val Asp Leu ValArg 645 650 655 Lys Asn Asn Cys Gln Phe Ile Asn Asp Tyr Ala Arg Lys LeuVal Glu 660 665 670 Leu Leu Leu Tyr Asp Asp Thr Val Ser Arg Ala Ala AlaGlu Ala Ser 675 680 685 Cys Val Ser Ile Ala Glu Trp Asn Arg Arg Ala MetPro Ser Gly Met 690 695 700 Ala Gly Phe Gly Arg Ile Ile Ala Asp Ala HisArg Gln Ile Thr Ser 705 710 715 720 Pro Lys Leu Asp Ile Asn Lys Phe ValMet Thr Ala Glu Leu Ser Arg 725 730 735 Pro Pro Ser Ala Tyr Ile Asn ArgArg Leu Ala His Leu Thr Val Tyr 740 745 750 Tyr Lys Leu Val Met Arg GlnGly Gln Ile Pro Asn Val Arg Glu Arg 755 760 765 Ile Pro Tyr Val Ile ValAla Pro Thr Asp Glu Val Glu Ala Asp Ala 770 775 780 Lys Ser Val Ala LeuLeu Arg Gly Asp Pro Leu Gln Asn Thr Ala Gly 785 790 795 800 Lys Arg CysGly Glu Ala Lys Arg Lys Leu Ile Ile Ser Asp Leu Ala 805 810 815 Glu AspPro Ile His Val Thr Ser His Gly Leu Ser Leu Asn Ile Asp 820 825 830 TyrTyr Phe Ser His Leu Ile Gly Thr Ala Ser Val Thr Phe Lys Ala 835 840 845Leu Phe Gly Asn 850 20 978 PRT Human cytomegalovirus (strain AD169) 20Gly Phe Pro Val Tyr Glu Val 
Arg Val Asp Pro Leu Thr Arg Leu Val 1 5 1015 Ile Asp Arg Arg Ile Thr Thr Phe Gly Trp Cys Ser Val Asn Arg Tyr 20 2530 Asp Trp Arg Gln Gln Gly Arg Ala Ser Thr Cys Asp Ile Glu Val Asp 35 4045 Cys Asp Val Ser Asp Leu Val Ala Val Pro Asp Asp Ser Ser Trp Pro 50 5560 Arg Tyr Arg Cys Leu Ser Phe Asp Ile Glu Cys Met Ser Gly Glu Gly 65 7075 80 Gly Phe Pro Cys Ala Glu Lys Ser Asp Asp Ile Val Ile Gln Ile Ser 8590 95 Cys Val Cys Tyr Glu Thr Gly Gly Asn Thr Ala Val Asp Gln Gly Ile100 105 110 Pro Asn Gly Asn Asp Gly Arg Gly Cys Thr Ser Glu Gly Val IlePhe 115 120 125 Gly His Ser Gly Leu His Leu Phe Thr Ile Gly Thr Cys GlyGln Val 130 135 140 Gly Pro Asp Val Asp Val Tyr Glu Phe Pro Ser Glu TyrGlu Leu Leu 145 150 155 160 Leu Gly Phe Met Leu Phe Phe Gln Arg Tyr AlaPro Ala Phe Val Thr 165 170 175 Gly Tyr Asn Ile Asn Ser Phe Asp Leu LysTyr Ile Leu Thr Arg Leu 180 185 190 Glu Tyr Leu Tyr Lys Val Asp Ser GlnArg Phe Cys Lys Leu Pro Thr 195 200 205 Ala Gln Gly Gly Arg Phe Phe LeuHis Ser Pro Ala Val Gly Phe Lys 210 215 220 Arg Gln Tyr Ala Ala Ala PhePro Ser Ala Ser His Asn Asn Pro Ala 225 230 235 240 Ser Thr Ala Ala ThrLys Val Tyr Ile Ala Gly Ser Val Val Ile Asp 245 250 255 Met Tyr Pro ValCys Met Ala Lys Thr Asn Ser Pro Asn Tyr Lys Leu 260 265 270 Asn Thr MetAla Glu Leu Tyr Leu Arg Gln Arg Lys Asp Asp Leu Ser 275 280 285 Tyr LysAsp Ile Pro Arg Cys Phe Val Ala Asn Ala Glu Gly Arg Ala 290 295 300 GlnVal Gly Arg Tyr Cys Leu Gln Asp Ala Val Leu Val Arg Asp Leu 305 310 315320 Phe Asn Thr Ile Asn Phe His Tyr Glu Ala Gly Ala Ile Ala Arg Leu 325330 335 Ala Lys Ile Pro Leu Arg Arg Val Ile Phe Asp Gly Gln Gln Ile Arg340 345 350 Ile Tyr Thr Ser Leu Leu Asp Glu Cys Ala Cys Arg Asp Phe IleLeu 355 360 365 Pro Asn His Tyr Ser Lys Gly Thr Thr Val Pro Glu Thr AsnSer Val 370 375 380 Ala Val Ser Pro Asn Ala Ala Ile Ile Ser Thr Ala AlaVal Pro Gly 385 390 395 400 Asp Ala Gly Ser Val Ala Ala Met Phe Gln MetSer Pro Pro Leu Gln 405 410 415 Ser Ala Pro Ser Ser Gln Asp Gly Val SerPro Gly Ser Gly Ser Asn 420 425 430 
Ser Ser Ser Ser Val Gly Val Phe SerVal Gly Ser Gly Ser Ser Gly 435 440 445 Gly Val Gly Val Ser Asn Asp AsnHis Gly Ala Gly Gly Thr Ala Ala 450 455 460 Val Ser Tyr Gln Gly Ala ThrVal Phe Glu Pro Glu Val Gly Tyr Tyr 465 470 475 480 Asn Asp Pro Val AlaVal Phe Asp Phe Ala Ser Leu Tyr Pro Ser Ile 485 490 495 Ile Met Ala HisAsn Leu Cys Tyr Ser Thr Leu Leu Val Pro Gly Gly 500 505 510 Glu Tyr ProVal Asp Pro Ala Asp Val Tyr Ser Val Thr Leu Glu Asn 515 520 525 Gly ValThr His Arg Phe Val Arg Ala Ser Val Arg Val Ser Val Leu 530 535 540 SerGlu Leu Leu Asn Lys Trp Val Ser Gln Arg Arg Ala Val Arg Glu 545 550 555560 Cys Met Arg Glu Cys Gln Asp Pro Val Arg Arg Met Leu Leu Asp Lys 565570 575 Glu Gln Met Ala Leu Lys Val Thr Cys Asn Ala Phe Tyr Gly Phe Thr580 585 590 Gly Val Val Asn Gly Met Met Pro Cys Leu Pro Ile Ala Ala SerIle 595 600 605 Thr Arg Ile Gly Arg Asp Met Leu Glu Arg Thr Ala Arg PheIle Lys 610 615 620 Asp Asn Phe Ser Glu Pro Cys Phe Leu His Asn Phe PheAsn Gln Glu 625 630 635 640 Asp Tyr Val Val Gly Thr Arg Glu Gly Asp SerGlu Glu Ser Ser Ala 645 650 655 Leu Pro Glu Gly Leu Glu Thr Ser Ser GlyGly Ser Asn Glu Arg Arg 660 665 670 Val Glu Ala Arg Val Ile Tyr Gly AspThr Asp Ser Val Phe Val Arg 675 680 685 Phe Arg Gly Leu Thr Pro Gln AlaLeu Val Ala Arg Gly Pro Ser Leu 690 695 700 Ala His Tyr Val Thr Ala CysLeu Phe Val Glu Pro Val Lys Leu Glu 705 710 715 720 Phe Glu Lys Val PheVal Ser Leu Met Met Ile Cys Lys Lys Arg Tyr 725 730 735 Ile Gly Lys ValGlu Gly Ala Ser Gly Leu Ser Met Lys Gly Val Asp 740 745 750 Leu Val ArgLys Thr Ala Cys Glu Phe Val Lys Gly Val Thr Arg Asp 755 760 765 Val LeuSer Leu Leu Phe Glu Asp Arg Glu Val Ser Glu Ala Ala Val 770 775 780 ArgLeu Ser Arg Leu Ser Leu Asp Glu Val Lys Lys Tyr Gly Val Pro 785 790 795800 Arg Gly Phe Trp Arg Ile Leu Arg Arg Leu Val Gln Ala Arg Asp Asp 805810 815 Leu Tyr Leu His Arg Val Arg Val Glu Asp Leu Val Leu Ser Ser Val820 825 830 Leu Ser Lys Asp Ile Ser Leu Tyr Arg Gln Ser Asn Leu Pro HisIle 835 840 845 Ala Val Ile Lys Arg Leu Ala Ala 
Arg Ser Glu Glu Leu ProSer Val 850 855 860 Gly Asp Arg Val Phe Tyr Val Leu Thr Ala Pro Gly ValArg Thr Ala 865 870 875 880 Pro Gln Gly Ser Ser Asp Asn Gly Asp Ser ValThr Ala Gly Val Val 885 890 895 Ser Arg Ser Asp Ala Ile Asp Gly Thr AspAsp Asp Ala Asp Gly Gly 900 905 910 Gly Val Glu Glu Ser Asn Arg Arg GlyGly Glu Pro Ala Lys Lys Arg 915 920 925 Ala Arg Lys Pro Pro Ser Ala ValCys Asn Tyr Glu Val Ala Glu Asp 930 935 940 Pro Ser Tyr Val Arg Glu HisGly Val Pro Ile His Ala Asp Lys Tyr 945 950 955 960 Phe Glu Gln Val LeuLys Ala Val Thr Asn Val Leu Ser Pro Val Phe 965 970 975 Pro Gly 21 814PRT Murine cytomegalovirus (strain Smith) 21 Gly Arg Lys Val Tyr Glu LeuGly Val Asp Pro Leu Ala Arg Phe Leu 1 5 10 15 Ile Asp Arg Lys Ile ProSer Phe Gly Trp Cys Leu Ala Arg Arg Tyr 20 25 30 Ser Val Arg Ala Ala GlyTyr Val Ser Arg Ala Gln Leu Glu Ile Asp 35 40 45 Cys Asp Val Ala Asp IleLeu Pro Ile Glu Glu Gln Ser Asn Trp Pro 50 55 60 Phe Tyr Arg Cys Leu SerPhe Asp Ile Glu Cys Met Ser Gly Thr Gly 65 70 75 80 Ala Phe Pro Ala AlaGlu Asn Val Asp Asp Ile Ile Ile Gln Ile Ser 85 90 95 Cys Val Cys Phe GlyVal Gly Glu Met Val His His Ala Tyr Asp Val 100 105 110 His Ala Asp LeuSer Thr Pro Ala Val Pro Glu Asn His Leu Phe Thr 115 120 125 Ile Gly ProCys Ala Pro Ile Pro Asp Val Lys Ile Tyr Thr Phe Pro 130 135 140 Ser GluTyr Glu Met Leu Arg Gly Phe Phe Ile Phe Leu Ser Trp Tyr 145 150 155 160Ser Pro Glu Phe Ile Thr Gly Tyr Asn Ile Asn Gly Phe Asp Ile Lys 165 170175 Tyr Ile Leu Thr Arg Ala Glu Lys Leu Tyr Lys Met Asp Val Gly Gln 180185 190 Phe Thr Lys Leu Arg Arg Gly Gly Arg Met Phe Val Phe Ser Pro Glu195 200 205 Lys Gly Lys Ala Gly Phe Gly Thr Ser Asn Thr Val Lys Val PheTrp 210 215 220 Ser Gly Thr Val Val Leu Asp Met Tyr Pro Val Cys Thr AlaLys Ala 225 230 235 240 Ser Ser Pro Asn Tyr Lys Leu Asp Thr Met Ala GluIle Tyr Leu Lys 245 250 255 Lys Lys Lys Asp Asp Leu Ser Tyr Lys Glu IlePro Val Gln Phe Ser 260 265 270 Ala Gly Asp Glu Gly Arg Ala Pro Gly GlyLys Tyr Cys Leu Gln Asp 275 280 285 Ala Val Leu Val Arg 
Glu Leu Phe GluMet Leu Ala Phe His Phe Glu 290 295 300 Ala Ala Ala Ile Ala Arg Leu AlaArg Ile Pro Leu Arg Lys Val Ile 305 310 315 320 Phe Asp Gly Gln Gln IleArg Ile Tyr Thr Cys Leu Leu Glu Glu Cys 325 330 335 Ser Gly Arg Asp MetIle Leu Pro Asn Met Pro Ser Leu Gly His Gly 340 345 350 Ala Ala Ala AlaIle Glu Glu Ala Ala Ala Gly Gly Glu Gly Asp Glu 355 360 365 Thr Ser GluGly Glu Asn Ser Asn Asn Ser Arg Thr Val Gly Tyr Gln 370 375 380 Gly AlaThr Val Leu Glu Pro Glu Cys Gly Phe His His Val Pro Val 385 390 395 400Cys Val Phe Asp Phe Ala Ser Leu Tyr Pro Ser Ile Ile Met Ser Asn 405 410415 Asn Leu Cys Tyr Ser Thr Leu Leu Val Glu Gly Ser Pro Glu Val Pro 420425 430 Glu Lys Asp Val Leu Arg Val Glu Ile Gly Asp Gln Cys His Arg Phe435 440 445 Val Arg Glu Asn Val His Arg Ser Leu Leu Ala Glu Leu Leu ValArg 450 455 460 Trp Leu Thr Gln Arg Lys Leu Val Arg Glu Ala Met Lys GlnCys Thr 465 470 475 480 Asn Glu Met Gln Arg Met Ile Met Asp Lys Gln GlnLeu Ala Leu Lys 485 490 495 Val Thr Cys Asn Ala Phe Tyr Gly Phe Thr GlyVal Ala Ala Gly Met 500 505 510 Leu Pro Cys Leu Pro Ile Ala Ala Ser IleThr Lys Ile Gly Arg Asp 515 520 525 Met Leu Leu Ala Thr Ala Gly His IleGlu Asp Arg Cys Asn Arg Pro 530 535 540 Asp Phe Leu Arg Thr Val Leu GlyLeu Pro Pro Glu Ala Ile Asp Pro 545 550 555 560 Glu Ala Leu Arg Val LysIle Ile Tyr Gly Asp Thr Asp Ser Val Phe 565 570 575 Ala Ala Phe Tyr GlyIle Asp Lys Glu Ala Leu Leu Lys Ala Val Gly 580 585 590 Ala Leu Ala AlaAsn Val Thr Asn Ala Leu Phe Lys Glu Pro Val Arg 595 600 605 Leu Glu PheGlu Lys Met Phe Val Ser Leu Met Met Ile Cys Lys Lys 610 615 620 Arg TyrIle Gly Lys Val His Gly Ser Gln Asn Leu Ser Met Lys Gly 625 630 635 640Val Asp Leu Val Arg Arg Thr Ala Cys Gly Phe Val Lys Ala Val Val 645 650655 Ser Asp Val Leu His Met Val Phe Asn Asp Glu Thr Val Ser Glu Gly 660665 670 Thr Met Lys Leu Ser Arg Met Thr Phe Asp Asp Leu Lys Lys Asn Gly675 680 685 Ile Pro Cys Glu Phe Gly Pro Val Val Ser Arg Leu Cys Arg AlaArg 690 695 700 Asp Asp Leu His Leu Lys Lys Val Pro Val Pro Glu Leu 
ThrLeu Ser 705 710 715 720 Ser Val Leu Ser Gln Glu Leu Ser Cys Tyr Lys GlnLys Asn Leu Pro 725 730 735 His Leu Ala Val Ile Arg Arg Leu Ala Ala ArgLys Glu Glu Leu Pro 740 745 750 Ala Val Gly Asp Arg Val Glu Tyr Val LeuThr Leu Pro Asp Gly Cys 755 760 765 Lys Lys Asn Val Pro Asn Tyr Glu IleAla Glu Asp Pro Arg His Val 770 775 780 Val Glu Ala Lys Leu Ser Ile AsnAla Glu Lys Tyr Tyr Glu Gln Val 785 790 795 800 Val Lys Ala Val Thr AsnThr Leu Met Pro Val Phe Pro Arg 805 810 22 771 PRT Herpes simplex virustype 6/strain Uganda-1102 22 Gly Phe Val Val Tyr Glu Ile Asp Val Asp ValLeu Thr Arg Phe Phe 1 5 10 15 Val Asp Asn Gly Phe Leu Ser Phe Gly TrpTyr Asn Val Lys Lys Tyr 20 25 30 Ile Pro Gln Asp Met Gly Lys Gly Ser AsnLeu Glu Val Glu Ile Asn 35 40 45 Cys His Val Ser Asp Leu Val Ser Leu GluAsp Val Asn Trp Pro Leu 50 55 60 Tyr Gly Cys Trp Ser Phe Asp Ile Glu CysLeu Gly Gln Asn Gly Asn 65 70 75 80 Phe Pro Asp Ala Glu Asn Leu Gly AspIle Val Ile Gln Ile Ser Val 85 90 95 Ile Ser Phe Asp Thr Glu Gly Asp ArgAsp Glu Arg His Leu Phe Thr 100 105 110 Leu Gly Thr Cys Glu Lys Ile AspGly Val His Ile Tyr Glu Phe Ala 115 120 125 Ser Glu Phe Glu Leu Leu LeuGly Phe Phe Ile Phe Leu Arg Ile Glu 130 135 140 Ser Pro Glu Phe Ile ThrGly Tyr Asn Ile Asn Asn Phe Asp Leu Lys 145 150 155 160 Tyr Leu Cys IleArg Met Asp Lys Ile Tyr His Tyr Asp Ile Gly Cys 165 170 175 Phe Ser LysLeu Lys Asn Gly Lys Ile Gly Ile Ser Val Pro His Glu 180 185 190 Gln TyrArg Lys Gly Phe Leu Gln Ala Gln Thr Lys Val Phe Thr Ser 195 200 205 GlyVal Leu Tyr Leu Asp Met Tyr Pro Val Tyr Ser Ser Lys Ile Thr 210 215 220Ala Gln Asn Tyr Lys Leu Asp Thr Ile Ala Lys Ile Cys Leu Gln Gln 225 230235 240 Glu Lys Glu Gln Leu Ser Tyr Lys Glu Ile Pro Lys Lys Phe Ile Ser245 250 255 Gly Pro Ser Gly Arg Ala Val Val Gly Lys Tyr Cys Leu Gln AspSer 260 265 270 Val Leu Val Val Arg Leu Phe Lys Gln Ile Asn Tyr His PheGlu Val 275 280 285 Ala Glu Val Ala Arg Leu Ala His Val Thr Ala Arg CysVal Val Phe 290 295 300 Glu Gly Gln Gln Lys Lys Ile Phe Pro Cys Ile LeuThr 
Glu Ala Lys 305 310 315 320 Arg Arg Asn Met Ile Leu Pro Ser Met ValSer Ser His Asn Arg Gln 325 330 335 Gly Ile Gly Tyr Lys Gly Ala Thr ValLeu Glu Pro Lys Thr Gly Tyr 340 345 350 Tyr Ala Val Pro Thr Val Val PheAsp Phe Gln Ser Leu Tyr Pro Ser 355 360 365 Ile Met Met Ala His Asn LeuCys Tyr Ser Thr Leu Val Leu Asp Glu 370 375 380 Arg Gln Ile Ala Gly LeuSer Glu Ser Asp Ile Leu Thr Val Lys Leu 385 390 395 400 Gly Asp Glu ThrHis Arg Phe Val Lys Pro Cys Ile Arg Glu Ser Val 405 410 415 Leu Gly SerLeu Leu Lys Asp Trp Leu Ala Lys Arg Arg Glu Val Lys 420 425 430 Ala GluMet Gln Asn Cys Ser Asp Pro Met Met Lys Leu Leu Leu Asp 435 440 445 LysLys Gln Leu Ala Leu Lys Thr Thr Cys Asn Ser Val Tyr Gly Val 450 455 460Thr Gly Ala Ala His Gly Leu Leu Pro Cys Val Ala Ile Ala Ala Ser 465 470475 480 Val Thr Cys Leu Gly Arg Glu Met Leu Cys Ser Thr Val Asp Tyr Val485 490 495 Asn Ser Lys Met Gln Ser Glu Gln Phe Phe Cys Glu Glu Phe GlyLeu 500 505 510 Thr Ser Ser Asp Phe Thr Gly Asp Leu Glu Val Glu Val IleTyr Gly 515 520 525 Asp Thr Asp Ser Ile Phe Met Ser Val Arg Asn Met ValAsn Gln Ser 530 535 540 Leu Arg Arg Ile Ala Pro Met Ile Ala Lys His IleThr Asp Arg Leu 545 550 555 560 Phe Lys Ser Pro Ile Lys Leu Glu Phe GluLys Ile Leu Cys Pro Leu 565 570 575 Ile Leu Ile Cys Lys Lys Arg Tyr IleGly Arg Gln Asp Asp Ser Leu 580 585 590 Leu Ile Phe Lys Gly Val Asp LeuVal Arg Lys Thr Ser Cys Asp Phe 595 600 605 Val Lys Gly Val Val Lys AspIle Val Asp Leu Leu Phe Phe Asp Glu 610 615 620 Glu Val Gln Thr Ala AlaVal Glu Phe Ser His Met Thr Gln Thr Gln 625 630 635 640 Leu Arg Glu GlnGly Val Pro Val Gly Ile His Lys Ile Leu Arg Arg 645 650 655 Leu Cys GluAla Arg Glu Glu Leu Phe Gln Asn Arg Ala Asp Val Arg 660 665 670 His LeuMet Leu Ser Ser Val Leu Ser Lys Glu Met Ala Ala Tyr Lys 675 680 685 GlnPro Asn Leu Ala His Leu Ser Val Ile Arg Arg Leu Ala Gln Arg 690 695 700Lys Glu Glu Ile Pro Asn Val Gly Asp Arg Ile Met Tyr Val Leu Ile 705 710715 720 Ala Pro Ser Ile Gly Asn Lys Gln Thr His Asn Tyr Glu Leu Ala Glu725 730 735 Asp 
Pro Asn Tyr Val Ile Glu His Lys Ile Pro Ile His Ala GluLys 740 745 750 Tyr Phe Asp Gln Ile Ile Lys Ala Val Thr Asn Ala Ile SerPro Ile 755 760 765 Phe Pro Lys 770 23 757 PRT Homo sapiens 23 Ser HisVal Phe Gly Thr Asn Thr Ser Ser Leu Glu Leu Phe Leu Met 1 5 10 15 AsnArg Lys Ile Lys Gly Pro Cys Trp Leu Glu Val Lys Lys Ser Thr 20 25 30 AlaLeu Asn Gln Pro Val Ser Trp Cys Lys Val Glu Ala Met Ala Leu 35 40 45 LysPro Asp Leu Val Asn Val Ile Lys Asp Val Ser Pro Pro Pro Leu 50 55 60 ValVal Met Ala Phe Ser Met Lys Thr Met Gln Asn Ala Lys Asn His 65 70 75 80Gln Asn Glu Ile Ile Ala Met Ala Ala Leu Val His His Ser Phe Ala 85 90 95Leu Asp Lys Ala Ala Pro Lys Pro Pro Phe Gln Ser His Phe Cys Val 100 105110 Val Ser Lys Pro Lys Asp Cys Ile Phe Pro Tyr Ala Phe Lys Glu Val 115120 125 Ile Glu Lys Lys Asn Val Lys Val Glu Val Ala Ala Thr Glu Arg Thr130 135 140 Leu Leu Gly Phe Phe Leu Ala Lys Val His Lys Ile Asp Pro AspIle 145 150 155 160 Ile Val Gly His Asn Ile Tyr Gly Phe Glu Leu Glu ValLeu Leu Gln 165 170 175 Arg Ile Asn Val Cys Lys Ala Pro His Trp Ser LysIle Gly Arg Leu 180 185 190 Lys Arg Ser Asn Met Pro Lys Leu Gly Gly ArgSer Gly Phe Gly Glu 195 200 205 Arg Asn Ala Thr Cys Gly Arg Met Ile CysAsp Val Glu Ile Ser Ala 210 215 220 Lys Glu Leu Ile Arg Cys Lys Ser TyrHis Leu Ser Glu Leu Val Gln 225 230 235 240 Gln Ile Leu Lys Thr Glu ArgVal Val Ile Pro Met Glu Asn Ile Gln 245 250 255 Asn Met Tyr Ser Glu SerSer Gln Leu Leu Tyr Leu Leu Glu His Thr 260 265 270 Trp Lys Asp Ala LysPhe Ile Leu Gln Ile Met Cys Glu Leu Asn Val 275 280 285 Leu Pro Leu AlaLeu Gln Ile Thr Asn Ile Ala Gly Asn Ile Met Ser 290 295 300 Arg Thr LeuMet Gly Gly Arg Ser Glu Arg Asn Glu Phe Leu Leu Leu 305 310 315 320 HisAla Phe Tyr Glu Asn Asn Tyr Ile Val Pro Asp Lys Gln Ile Phe 325 330 335Arg Lys Pro Gln Gln Lys Leu Gly Asp Glu Asp Glu Glu Ile Asp Gly 340 345350 Asp Thr Asn Lys Tyr Lys Lys Gly Arg Lys Lys Gly Ala Tyr Ala Gly 355360 365 Gly Leu Val Leu Asp Pro Lys Val Gly Phe Tyr Asp Lys Phe Ile Leu370 375 380 Leu Leu Asp 
Phe Asn Ser Leu Tyr Pro Ser Ile Ile Gln Glu PheAsn 385 390 395 400 Ile Cys Phe Thr Thr Val Gln Arg Val Ala Ser Glu AlaGln Lys Val 405 410 415 Thr Glu Asp Gly Glu Gln Glu Gln Ile Pro Glu LeuPro Asp Pro Ser 420 425 430 Leu Glu Met Gly Ile Leu Pro Arg Glu Ile ArgLys Leu Val Glu Arg 435 440 445 Arg Lys Gln Val Lys Gln Leu Met Lys GlnGln Asp Leu Asn Pro Asp 450 455 460 Leu Ile Leu Gln Tyr Asp Ile Arg GlnLys Ala Leu Lys Leu Thr Ala 465 470 475 480 Asn Ser Met Tyr Gly Cys LeuGly Phe Ser Tyr Ser Arg Phe Tyr Ala 485 490 495 Lys Pro Leu Ala Ala LeuVal Thr Tyr Lys Gly Arg Glu Ile Leu Met 500 505 510 His Thr Lys Glu MetVal Gln Lys Met Asn Leu Glu Val Ile Tyr Gly 515 520 525 Asp Thr Asp SerIle Met Ile Asn Thr Asn Ser Thr Asn Leu Glu Glu 530 535 540 Val Phe LysLeu Gly Asn Lys Val Lys Ser Glu Val Asn Lys Leu Tyr 545 550 555 560 LysLeu Leu Glu Ile Asp Ile Asp Gly Val Phe Lys Ser Leu Leu Leu 565 570 575Leu Lys Lys Lys Lys Tyr Ala Ala Leu Val Val Glu Pro Thr Ser Asp 580 585590 Gly Asn Tyr Val Thr Lys Gln Glu Leu Lys Gly Leu Asp Ile Val Arg 595600 605 Arg Asp Trp Cys Asp Leu Ala Lys Asp Thr Gly Asn Phe Val Ile Gly610 615 620 Gln Ile Leu Ser Asp Gln Ser Arg Asp Thr Ile Val Glu Asn IleGln 625 630 635 640 Lys Arg Leu Ile Glu Ile Gly Glu Asn Val Leu Asn GlySer Val Pro 645 650 655 Val Ser Gln Phe Glu Ile Asn Lys Ala Leu Thr LysAsp Pro Gln Asp 660 665 670 Tyr Pro Asp Lys Lys Ser Leu Pro His Val HisVal Ala Leu Trp Ile 675 680 685 Asn Ser Gln Gly Gly Arg Lys Val Lys AlaGly Asp Thr Val Ser Tyr 690 695 700 Val Ile Cys Gln Asp Gly Ser Asn LeuThr Ala Ser Gln Arg Ala Tyr 705 710 715 720 Ala Pro Glu Gln Leu Gln LysGln Asp Asn Leu Thr Ile Asp Thr Gln 725 730 735 Tyr Tyr Leu Ala Gln GlnIle His Pro Val Val Ala Arg Ile Cys Glu 740 745 750 Pro Ile Asp Gly Ile755 24 757 PRT Mus musculus 24 Ser His Val Phe Gly Thr Asn Thr Ser SerLeu Glu Leu Phe Leu Met 1 5 10 15 Asn Arg Lys Ile Lys Gly Pro Cys TrpLeu Glu Val Lys Asn Pro Gln 20 25 30 Leu Leu Asn Gln Pro Ile Ser Trp CysLys Phe Glu Val Met Ala Leu 35 40 45 
Lys Pro Asp Leu Val Asn Val Ile LysAsp Val Ser Pro Pro Pro Leu 50 55 60 Val Val Met Ser Phe Ser Met Lys ThrMet Gln Asn Val Gln Asn His 65 70 75 80 Gln His Glu Ile Ile Ala Met AlaAla Leu Val His His Ser Phe Ala 85 90 95 Leu Asp Lys Ala Pro Pro Glu ProPro Phe Gln Thr His Phe Cys Val 100 105 110 Val Ser Lys Pro Lys Asp CysIle Phe Pro Cys Asp Phe Lys Glu Val 115 120 125 Ile Ser Lys Lys Asn MetLys Val Glu Ile Ala Ala Thr Glu Arg Thr 130 135 140 Leu Ile Gly Phe PheLeu Ala Lys Val His Lys Ile Asp Pro Asp Ile 145 150 155 160 Leu Val GlyHis Asn Ile Cys Ser Phe Glu Leu Glu Val Leu Leu Gln 165 170 175 Arg IleAsn Glu Cys Lys Val Pro Tyr Trp Ser Lys Ile Gly Arg Leu 180 185 190 ArgArg Ser Asn Met Pro Lys Leu Gly Ser Arg Ser Gly Phe Gly Glu 195 200 205Arg Asn Ala Thr Cys Gly Arg Met Ile Cys Asp Val Glu Ile Ser Ala 210 215220 Lys Glu Leu Ile His Cys Lys Ser Tyr His Leu Ser Glu Leu Val Gln 225230 235 240 Gln Ile Leu Lys Thr Glu Arg Ile Val Ile Pro Thr Glu Asn IleArg 245 250 255 Asn Met Tyr Ser Glu Ser Ser Tyr Leu Leu Tyr Leu Leu GluHis Ile 260 265 270 Trp Lys Asp Ala Arg Phe Ile Leu Gln Ile Met Cys GluLeu Asn Val 275 280 285 Leu Pro Leu Ala Leu Gln Ile Thr Asn Ile Ala GlyAsn Ile Met Ser 290 295 300 Arg Thr Leu Met Gly Gly Arg Ser Glu Arg AsnGlu Phe Leu Leu Leu 305 310 315 320 His Ala Phe Tyr Glu Asn Asn Tyr IleVal Pro Asp Lys Gln Ile Phe 325 330 335 Arg Lys Pro Gln Gln Lys Leu GlyAsp Glu Asp Glu Glu Ile Asp Gly 340 345 350 Asp Thr Asn Lys Tyr Lys LysGly Arg Lys Lys Ala Thr Tyr Ala Gly 355 360 365 Gly Leu Val Leu Asp ProLys Val Gly Phe Tyr Asp Lys Phe Ile Leu 370 375 380 Leu Leu Asp Phe AsnSer Leu Tyr Pro Ser Ile Ile Gln Glu Phe Asn 385 390 395 400 Ile Cys PheThr Thr Val Gln Arg Val Thr Ser Glu Val Gln Lys Ala 405 410 415 Thr GluAsp Glu Glu Gln Glu Gln Ile Pro Glu Leu Pro Asp Pro Asn 420 425 430 LeuGlu Met Gly Ile Leu Pro Arg Glu Ile Arg Lys Leu Val Glu Arg 435 440 445Arg Lys Gln Val Lys Gln Leu Met Lys Gln Gln Asp Leu Asn Pro Asp 450 455460 Leu Val Leu Gln Tyr Asp Ile Arg Gln Lys 
Ala Leu Lys Leu Thr Ala 465470 475 480 Asn Ser Met Tyr Gly Cys Leu Gly Phe Ser Tyr Ser Arg Phe TyrAla 485 490 495 Lys Pro Leu Ala Ala Leu Val Thr Tyr Lys Gly Arg Glu IleLeu Met 500 505 510 His Thr Lys Asp Met Val Gln Lys Met Asn Leu Glu ValIle Tyr Gly 515 520 525 Asp Thr Asp Ser Ile Met Ile Asn Thr Asn Ser ThrAsn Leu Glu Glu 530 535 540 Val Phe Lys Leu Gly Asn Lys Val Lys Ser GluVal Asn Lys Leu Tyr 545 550 555 560 Lys Leu Leu Glu Ile Asp Ile Asp AlaVal Phe Lys Ser Leu Leu Leu 565 570 575 Leu Lys Lys Lys Lys Tyr Ala AlaLeu Val Val Glu Pro Thr Ser Asp 580 585 590 Gly Asn Tyr Ile Thr Lys GlnGlu Leu Lys Gly Leu Asp Ile Val Arg 595 600 605 Arg Asp Trp Cys Asp LeuAla Lys Asp Thr Gly Asn Phe Val Ile Gly 610 615 620 Gln Ile Leu Ser AspGln Ser Arg Asp Thr Ile Val Glu Asn Ile Gln 625 630 635 640 Lys Arg LeuIle Glu Ile Gly Glu Asn Val Leu Asn Gly Ser Val Pro 645 650 655 Val SerGln Phe Glu Ile Asn Lys Ala Leu Thr Lys Asp Pro Gln Asp 660 665 670 TyrPro Asp Arg Lys Ser Leu Pro His Val His Val Ala Leu Trp Ile 675 680 685Asn Ser Gln Gly Gly Arg Lys Val Lys Ala Gly Asp Thr Val Ser Tyr 690 695700 Val Ile Cys Gln Asp Gly Ser Asn Leu Thr Ala Thr Gln Arg Ala Tyr 705710 715 720 Ala Pro Glu Gln Leu Gln Lys Leu Asp Asn Leu Ala Ile Asp ThrGln 725 730 735 Tyr Tyr Leu Ala Gln Gln Ile His Pro Val Val Ala Arg IleCys Glu 740 745 750 Pro Ile Asp Gly Ile 755 25 748 PRT Drosophilamelanogaster 25 Ala His Ile Phe Gly Ala Thr Thr Asn Ala Leu Glu Arg PheLeu Leu 1 5 10 15 Asp Arg Lys Ile Lys Gly Pro Cys Trp Leu Gln Val ThrGly Phe Lys 20 25 30 Val Ser Pro Thr Pro Met Ser Trp Cys Asn Thr Glu ValThr Leu Thr 35 40 45 Glu Pro Lys Asn Val Glu Leu Val Gln Asp Lys Gly LysPro Ala Pro 50 55 60 Pro Pro Pro Leu Thr Leu Leu Ser Leu Asn Val Arg ThrSer Met Asn 65 70 75 80 Pro Lys Thr Ser Arg Asn Glu Ile Cys Met Ile SerMet Leu Thr His 85 90 95 Asn Arg Phe His Ile Asp Arg Pro Ala Pro Gln ProAla Phe Asn Arg 100 105 110 His Met Cys Ala Leu Thr Arg Pro Ala Val ValSer Trp Pro Leu Asp 115 120 125 Leu Asn Phe Glu Met Ala Lys 
Tyr Lys SerThr Thr Val His Lys His 130 135 140 Asp Ser Glu Arg Ala Leu Leu Ser TrpPhe Leu Ala Gln Tyr Gln Lys 145 150 155 160 Ile Asp Ala Asp Leu Ile ValThr Phe Asp Ser Met Asp Cys Gln Leu 165 170 175 Asn Val Ile Thr Asp GlnIle Val Ala Leu Lys Ile Pro Gln Trp Ser 180 185 190 Arg Met Gly Arg LeuArg Leu Ser Gln Ser Phe Gly Lys Arg Leu Leu 195 200 205 Glu His Phe ValGly Arg Met Val Cys Asp Val Lys Arg Ser Ala Glu 210 215 220 Glu Cys IleArg Ala Arg Ser Tyr Asp Leu Gln Thr Leu Cys Lys Gln 225 230 235 240 ValLeu Lys Leu Lys Glu Ser Glu Arg Met Glu Val Asn Ala Asp Asp 245 250 255Leu Leu Glu Met Tyr Glu Lys Gly Glu Ser Ile Thr Lys Leu Ile Ser 260 265270 Leu Thr Met Gln Asp Asn Ser Tyr Leu Leu Arg Leu Met Cys Glu Leu 275280 285 Asn Ile Met Pro Leu Ala Leu Gln Ile Thr Asn Ile Cys Gly Asn Thr290 295 300 Met Thr Arg Thr Leu Gln Gly Gly Arg Ser Glu Arg Asn Glu PheLeu 305 310 315 320 Leu Leu His Ala Ser Thr Glu Lys Asn Tyr Ile Val ProAsp Lys Lys 325 330 335 Pro Val Ser Lys Arg Ser Gly Ala Gly Asp Thr AspArg Thr Leu Ser 340 345 350 Gly Ala Asp Ala Thr Met Gln Thr Lys Lys LysAla Ala Tyr Ala Gly 355 360 365 Gly Leu Val Leu Glu Pro Met Arg Gly LeuTyr Glu Lys Tyr Val Leu 370 375 380 Leu Met Asp Leu Asn Ser Leu Tyr ProSer Ile Ile Gln Glu Tyr Asn 385 390 395 400 Ile Cys Phe Asn Pro Val GlnGln Pro Val Asp Ala Asp Glu Leu Pro 405 410 415 Thr Leu Pro Asp Ser LysThr Glu Pro Gly Ile Leu Pro Leu Gln Leu 420 425 430 Lys Arg Leu Val GluSer Arg Lys Glu Val Lys Lys Leu Met Ala Ala 435 440 445 Pro Asp Leu SerPro Glu Leu Gln Met Gln Tyr His Ile Arg Gln Met 450 455 460 Ala Leu LysLeu Thr Ala Asn Ser Met Tyr Gly Cys Leu Gly Phe Ala 465 470 475 480 HisSer Arg Phe Phe Ala Gln His Leu Ala Ala Leu Val Thr His Lys 485 490 495Gly Arg Asp Leu Thr Asn Thr Gln Gln Leu Val Gln Lys Met Asn Tyr 500 505510 Asp Val Val Tyr Gly Asp Thr Asp Ser Leu Met Ile Asn Thr Asn Ile 515520 525 Thr Asp Tyr Asp Gln Val Tyr Lys Ile Gly His Asn Ile Lys Gln Ser530 535 540 Val Asn Lys Leu Tyr Lys Gln Leu Glu Leu Asp Ile Asp Gly 
ValPhe 545 550 555 560 Gly Cys Leu Leu Leu Leu Lys Lys Lys Lys Tyr Ala AlaIle Lys Leu 565 570 575 Ser Lys Asp Ser Lys Gly Asn Leu Arg Arg Glu GlnGlu His Lys Gly 580 585 590 Leu Asp Ile Val Arg Arg Asp Trp Ser Gln LeuAla Val Met Val Gly 595 600 605 Lys Ala Val Leu Asp Glu Val Leu Ser GluLys Pro Leu Glu Glu Lys 610 615 620 Leu Asp Ala Val His Ala Gln Leu GluLys Ile Lys Thr Gln Ile Ala 625 630 635 640 Glu Gly Val Val Pro Leu ProLeu Phe Val Ile Thr Lys Gln Leu Thr 645 650 655 Arg Thr Pro Gln Asp TyrArg Asn Ser Ala Ser Leu Pro His Val Gln 660 665 670 Val Ala Leu Arg MetAsn Arg Glu Arg Asn Arg Arg Tyr Lys Lys Gly 675 680 685 Asp Met Val AspLeu Cys Asp Cys Leu Asp Gly Thr Thr Asn Ala Ala 690 695 700 Met Gln ArgAla Tyr His Leu Asp Glu Leu Lys Thr Ser Glu Asp Lys 705 710 715 720 LysLeu Gln Leu Asp Thr Asn Tyr Tyr Leu Gly His Gln Ile His Pro 725 730 735Val Val Thr Arg Met Val Glu Val Leu Glu Gly Thr 740 745 26 752 PRTSchizosaccharomyces pombe 26 Ser His Val Phe Gly Thr Asn Thr Ala Leu PheGlu Gln Phe Val Leu 1 5 10 15 Ser Arg Arg Val Met Gly Pro Cys Trp LeuLys Ile Gln Gln Pro Asn 20 25 30 Phe Asp Ala Val Lys Asn Ala Ser Trp CysArg Val Glu Ile Gly Cys 35 40 45 Ser Ser Pro Gln Asn Ile Ser Val Ser PheGlu Lys Asn Glu Ile Thr 50 55 60 Ser Lys Thr Pro Pro Met Thr Val Met SerLeu Ala Phe Arg Thr Leu 65 70 75 80 Ile Asn Lys Glu Gln Asn Lys Gln GluVal Val Met Ile Ser Ala Arg 85 90 95 Ile Phe Glu Asn Val Asp Ile Glu LysGly Leu Pro Ala Asn Asp Met 100 105 110 Pro Ser Tyr Ser Phe Ser Leu IleArg Pro Leu Lys Gln Ile Phe Pro 115 120 125 Asn Gly Phe Glu Lys Leu AlaArg Gln His Lys Ser Ser Ile Phe Cys 130 135 140 Glu Arg Ser Glu Val SerLeu Leu Asn Asn Phe Leu Asn Lys Val Arg 145 150 155 160 Thr Tyr Asp ProAsp Val Tyr Phe Gly His Asp Phe Glu Met Cys Tyr 165 170 175 Ser Val LeuLeu Ser Arg Leu Lys Glu Arg Lys Ile His Asn Trp Ser 180 185 190 Ser IleGly Arg Leu Arg Arg Ser Glu Trp Pro Arg Ser Phe Asn Arg 195 200 205 SerSer Gln Gln Phe Val Glu Lys Gln Ile Ile Ala Gly Arg Leu Met 210 215 220Cys Asp 
Leu Ser Asn Asp Phe Gly Arg Ser Met Ile Lys Ala Gln Ser 225 230235 240 Trp Ser Leu Ser Glu Ile Val Leu Lys Glu Leu Asp Ile Lys Arg Gln245 250 255 Asp Ile Asn Gln Glu Lys Ala Leu Gln Ser Trp Thr Asp Thr AlaHis 260 265 270 Gly Leu Leu Asp Tyr Leu Val His Cys Glu Ile Asp Thr PhePhe Ile 275 280 285 Ala Ala Val Ala Phe Lys Ile Gln Met Leu Gln Leu SerLys Asn Leu 290 295 300 Thr Asn Ile Ala Gly Asn Ser Trp Ala Arg Thr LeuThr Gly Thr Arg 305 310 315 320 Ala Glu Arg Asn Glu Tyr Ile Leu Leu HisGlu Phe Lys Lys Asn Gly 325 330 335 Tyr Ile Val Pro Asp Lys Gln Gln SerIle Arg Arg His Ala Glu Ala 340 345 350 Phe Gly Ala Glu Asp Gly Leu GlnGlu Glu Ser Leu Gly Lys Lys Lys 355 360 365 Asp Lys Tyr Lys Gly Gly LeuVal Phe Glu Pro Gln Lys Gly Leu Tyr 370 375 380 Glu Thr Cys Ile Leu ValMet Asp Phe Asn Ser Leu Tyr Pro Ser Ile 385 390 395 400 Ile Gln Glu TyrAsn Ile Cys Phe Thr Thr Val Asp Arg Ser Pro Ser 405 410 415 Asn Ser AspSer Asp Asp Gln Ile Pro Asp Thr Pro Ser Ala Ser Ala 420 425 430 Asn GlnGly Ile Phe Pro Arg Leu Ile Ala Asn Leu Val Glu Arg Arg 435 440 445 ArgGln Ile Lys Gly Leu Leu Lys Asp Asn Ser Ala Thr Pro Thr Gln 450 455 460Arg Leu Gln Trp Asp Ile Gln Gln Gln Ala Leu Lys Leu Thr Ala Asn 465 470475 480 Ser Met Tyr Gly Cys Leu Gly Tyr Thr Lys Ser Arg Phe Tyr Ala Arg485 490 495 Pro Leu Ala Val Leu Ile Thr Tyr Lys Gly Arg Glu Ala Leu MetAsn 500 505 510 Thr Lys Glu Leu Ala Asp Gln Met Gly Leu Gln Val Ile TyrGly Asp 515 520 525 Thr Asp Ser Val Met Leu Asn Thr Asn Val Thr Asp LysAsn His Ala 530 535 540 Leu Arg Ile Gly Asn Glu Phe Lys Glu Lys Val AsnGlu Arg Tyr Ser 545 550 555 560 Lys Leu Glu Ile Asp Ile Asp Asn Val TyrGln Arg Met Leu Leu His 565 570 575 Ala Lys Lys Lys Tyr Ala Ala Leu GlnLeu Asp Ser Gln Gly Lys Pro 580 585 590 Asn Leu Asp Val Lys Gly Leu AspMet Lys Arg Arg Glu Phe Cys Thr 595 600 605 Leu Ala Lys Glu Ala Ser LysPhe Cys Leu Asp Gln Ile Leu Ser Gly 610 615 620 Glu Leu Thr Glu Thr ValIle Glu Asn Ile His Ser Tyr Leu Met Asp 625 630 635 640 Phe Ser Glu LysMet Arg Asn Gly Lys 
Phe Pro Ala Asn Lys Phe Ile 645 650 655 Ile Phe AsnArg Leu Gly Lys Asn Pro Glu Asp Tyr Pro Asn Gly Lys 660 665 670 Thr MetPro Phe Val Gln Val Ala Leu Lys Lys Lys Ala Arg Gly Glu 675 680 685 AsnVal Arg Val Gly Asp Val Ile Pro Phe Ile Ile Ala Gly Ser Asp 690 695 700Ala Asp Gly His Pro Ala Asp Arg Ala Tyr Ser Pro Gln Glu Ile Met 705 710715 720 Asn Thr Asn Ser Thr Leu Val Ile Asp Tyr Asn Tyr Tyr Leu Ser His725 730 735 Gln Ile Leu Pro Pro Ile Glu Arg Val Ile Ala Pro Ile Glu GlyThr 740 745 750 27 761 PRT Saccharomyces cerevisiae 27 Tyr His Val PheGly Gly Asn Ser Asn Ile Phe Glu Ser Phe Val Ile 1 5 10 15 Gln Asn ArgIle Met Gly Pro Cys Trp Leu Asp Ile Lys Gly Ala Asp 20 25 30 Phe Asn SerIle Arg Asn Ala Ser His Cys Ala Val Glu Val Ser Val 35 40 45 Asp Lys ProGln Asn Ile Thr Pro Thr Thr Thr Lys Thr Met Pro Asn 50 55 60 Leu Arg CysLeu Ser Leu Ser Ile Gln Thr Leu Met Asn Pro Lys Glu 65 70 75 80 Asn LysGln Glu Ile Val Ser Ile Thr Leu Ser Ala Tyr Arg Asn Ile 85 90 95 Ser LeuAsp Ser Pro Ile Pro Glu Asn Ile Lys Pro Asp Asp Leu Cys 100 105 110 ThrLeu Val Arg Pro Pro Gln Ser Thr Ser Phe Pro Leu Gly Leu Ala 115 120 125Ala Leu Ala Lys Gln Lys Leu Pro Gly Arg Val Arg Leu Phe Asn Asn 130 135140 Glu Lys Ala Met Leu Ser Cys Phe Cys Ala Met Leu Lys Val Glu Asp 145150 155 160 Pro Asp Val Ile Ile Gly His Arg Leu Gln Asn Val Tyr Leu AspVal 165 170 175 Leu Ala His Arg Met His Asp Leu Asn Ile Pro Thr Phe SerSer Ile 180 185 190 Gly Arg Arg Leu Arg Arg Thr Trp Pro Glu Lys Phe GlyArg Gly Asn 195 200 205 Ser Asn Met Asn His Phe Phe Ile Ser Asp Ile CysSer Gly Arg Leu 210 215 220 Ile Cys Asp Ile Ala Asn Glu Met Gly Gln SerLeu Thr Pro Lys Cys 225 230 235 240 Gln Ser Trp Asp Leu Ser Glu Met TyrGln Val Thr Cys Glu Lys Glu 245 250 255 His Lys Pro Leu Asp Ile Asp TyrGln Asn Pro Gln Tyr Gln Asn Asp 260 265 270 Val Asn Ser Met Thr Met AlaLeu Gln Glu Asn Ile Thr Asn Cys Met 275 280 285 Ile Ser Ala Glu Val SerTyr Arg Ile Gln Leu Leu Thr Leu Thr Lys 290 295 300 Gln Leu Thr Asn LeuAla Gly Asn Ala Trp Ala Gln 
Thr Leu Gly Gly 305 310 315 320 Thr Arg AlaGly Arg Asn Glu Tyr Ile Leu Leu His Glu Phe Ser Arg 325 330 335 Asn GlyPhe Ile Val Pro Asp Lys Glu Gly Asn Arg Ser Arg Ala Gln 340 345 350 LysGln Arg Gln Asn Glu Glu Asn Ala Asp Ala Pro Val Asn Ser Lys 355 360 365Lys Ala Lys Tyr Gln Gly Gly Leu Val Phe Glu Pro Glu Lys Gly Leu 370 375380 His Lys Asn Tyr Val Leu Val Met Asp Phe Asn Ser Leu Tyr Pro Ser 385390 395 400 Ile Ile Gln Glu Phe Asn Ile Cys Phe Thr Thr Val Asp Arg AsnLys 405 410 415 Glu Asp Ile Asp Glu Leu Pro Ser Val Pro Pro Ser Glu ValAsp Gln 420 425 430 Gly Val Leu Pro Arg Leu Leu Ala Asn Leu Val Asp ArgArg Arg Glu 435 440 445 Val Lys Lys Val Met Lys Thr Glu Thr Asp Pro HisLys Arg Val Gln 450 455 460 Cys Asp Ile Arg Gln Gln Ala Leu Lys Leu ThrAla Asn Ser Met Tyr 465 470 475 480 Gly Cys Leu Gly Tyr Val Asn Ser ArgPhe Tyr Ala Lys Pro Leu Ala 485 490 495 Met Leu Val Thr Asn Lys Gly ArgGlu Ile Leu Met Asn Thr Arg Gln 500 505 510 Leu Ala Glu Ser Met Asn LeuLeu Val Val Tyr Gly Asp Thr Asp Ser 515 520 525 Val Met Ile Asp Thr GlyCys Asp Asn Tyr Ala Asp Ala Ile Lys Ile 530 535 540 Gly Leu Gly Phe LysArg Leu Val Asn Glu Arg Tyr Arg Leu Leu Glu 545 550 555 560 Ile Asp IleAsp Asn Val Phe Lys Lys Leu Leu Leu His Ala Lys Lys 565 570 575 Lys TyrAla Ala Leu Thr Val Asn Leu Asp Lys Asn Gly Asn Gly Thr 580 585 590 ThrVal Leu Glu Val Lys Gly Leu Asp Met Lys Arg Arg Glu Phe Cys 595 600 605Pro Leu Ser Arg Asp Val Ser Ile His Val Leu Asn Thr Ile Leu Ser 610 615620 Asp Lys Asp Pro Glu Glu Ala Leu Gln Glu Val Tyr Asp Tyr Leu Glu 625630 635 640 Asp Ile Arg Ile Lys Val Glu Thr Asn Asn Ile Arg Ile Asp LysTyr 645 650 655 Lys Ile Asn Met Lys Leu Ser Lys Asp Pro Lys Ala Tyr ProGly Gly 660 665 670 Lys Asn Met Pro Ala Val Gln Val Ala Leu Arg Met ArgLys Ala Gly 675 680 685 Arg Val Val Lys Ala Gly Ser Val Ile Thr Phe ValIle Thr Lys Gln 690 695 700 Asp Glu Ile Asp Asn Ala Ala Asp Thr Pro AlaLeu Ser Val Ala Glu 705 710 715 720 Arg Ala His Ala Leu Asn Glu Val MetIle Lys Ser Asn Asn Leu Ile 725 730 735 
Pro Asp Pro Gln Tyr Tyr Leu GluLys Gln Ile Phe Ala Pro Val Glu 740 745 750 Arg Leu Leu Glu Arg Ile AspSer Phe 755 760 28 761 PRT Trypanosoma brucei 28 Gln Val Val Val Gly AlaSer Arg Ser Leu Leu Glu Leu Phe Leu Ile 1 5 10 15 Lys Lys Arg Leu MetGly Pro Ser Tyr Leu Glu Ile Glu His Leu Val 20 25 30 Thr Ala Met Asp ArgVal Ser His Cys Lys Thr Glu Phe Leu Val Pro 35 40 45 Ser Pro Lys Asp IleLys Val Tyr Asn Ser Ser Lys Pro Pro Pro Pro 50 55 60 Phe Thr Val Ala SerIle Gln Leu His Ala Gln Leu Asp Ser Asp Gly 65 70 75 80 Val Lys Asn GluVal Ile Ala Ala Ser Ile Ala Leu Tyr Gly Asp Val 85 90 95 Ser Ile Asp GlyGlu Arg Lys Pro Asn Ile Thr Glu Cys Phe Thr Gly 100 105 110 Val Arg GlnLeu Ser Pro Asp Ala Pro Leu Pro Leu Asp Leu Glu Thr 115 120 125 Tyr CysLeu Ser Lys Arg Met Pro Gly Val His Arg Phe Ile Asn Glu 130 135 140 ArgAla Leu Leu Thr Trp Phe Ala Glu Thr Leu Ala Ala Leu Asp Pro 145 150 155160 Asp Ile Ile Val Gly His Asn Ile Ile Gly Tyr Thr Val Glu Thr Leu 165170 175 Leu Asn Arg Tyr Gln Glu Leu Asn Ile Val Arg Trp Ser Thr Ile Gly180 185 190 Arg Leu Asp Val Arg Arg Phe Pro Arg Ile Gln Gly Asn Asn PheAsn 195 200 205 Leu Ala Ile Glu Lys Glu Ala Cys Val Gly Arg Leu Val ValAsp Thr 210 215 220 Tyr Leu Leu Ala Arg Glu Tyr Tyr Lys Ser Thr Asn TyrLys Leu Leu 225 230 235 240 Ser Leu Ser Thr Gln Met Glu Ile Lys Gly IleThr Asp Asn Arg Gly 245 250 255 His Phe Glu Pro Gly Ser Thr Val Leu ValLys Asp Ser Met Met Ser 260 265 270 Ser Glu Ala Leu Cys Pro Ile Leu LeuGln Leu Leu Asn Cys Ala Val 275 280 285 Leu Ser Phe Asn Val Ala Ser PheLeu Asp Val Ile Pro Leu Thr Lys 290 295 300 Arg Leu Thr Leu Leu Ala GlyAsn Leu Trp Ser Arg Thr Leu Tyr Gly 305 310 315 320 Ala Arg Ser Glu ArgIle Glu Tyr Leu Leu Leu His Ala Phe His Asn 325 330 335 Leu Lys Phe ValThr Pro Asp Lys Lys Lys Arg Asp Leu Lys Arg Gly 340 345 350 Arg Glu AspAsp Asp Asp Glu Gly Lys Arg Lys Thr Lys Tyr Gln Gly 355 360 365 Gly MetVal Leu Glu Pro Lys Ser Gly Leu Tyr Ser Glu Tyr Ile Leu 370 375 380 LeuLeu Asp Phe Asn Ser Leu Tyr Pro Ser Leu Ile Gln 
Glu Phe Asn 385 390 395 400 Val Cys Tyr Thr Thr Ile Asp Arg Asp Glu Asn Thr Val Ser Ala Glu 405 410 415 Val Pro Pro Pro Glu Ser Leu Ile Cys Leu Ser Cys Arg Ala Ala Gly 420 425 430 Leu Pro Ser Pro Cys Leu His Lys Cys Ile Leu Pro Lys Val Ile Arg 435 440 445 Gly Leu Val Asp Ser Arg Arg Glu Ile Lys Arg Met Met Lys Ser Glu 450 455 460 Lys Asp Pro Gly Asn Leu Ala Met Leu Glu Ile Arg Gln Leu Ala Leu 465 470 475 480 Lys Leu Thr Ala Asn Ser Met Tyr Gly Cys Leu Gly Phe Glu Tyr Ser 485 490 495 Arg Phe Tyr Ala Gln Pro Leu Ala Glu Leu Val Thr Arg Gln Gly Arg 500 505 510 Leu Ala Leu Gln Asn Thr Val Glu Leu Ile Pro Gln Ile Ser Pro Ser 515 520 525 Ile Arg Val Ile Tyr Gly Asp Thr Asp Ser Val Met Ile Gln Thr Gly 530 535 540 Ile Lys Asp Asp Ile Val Lys Val Arg Asn Leu Gly Phe Glu Ile Lys 545 550 555 560 Gly Lys Val Asn Gln Arg Tyr Gln Ser Leu Glu Leu Asp Ile Asp Gly 565 570 575 Val Phe Arg Ala Met Leu Leu Leu Arg Lys Lys Lys Tyr Ala Ala Leu 580 585 590 Ser Val Val Asp Trp Gln Gly Glu Gly Lys Val Tyr Lys Arg Glu Val 595 600 605 Lys Gly Leu Asp Met Val Arg Arg Asp Trp Cys Pro Leu Ser Gln His 610 615 620 Val Ser Asp Ala Val Leu Lys Arg Ile Leu Asn Ala Glu Gly Gly Glu 625 630 635 640 Asp Ile Leu Asp Phe Val Ile Lys Tyr Met Lys Gly Val Ala Gln Asp 645 650 655 Val Arg Ser Gly Asn Val Tyr Pro Leu Glu Glu Phe Val Ile Ser Lys 660 665 670 Ser Leu Thr Lys Glu Pro Glu Ser Tyr His Gly Thr Gly Tyr Pro His 675 680 685 Ala Val Val Ala Leu Arg Met Lys Gln Arg Lys Glu Gly Val Arg Val 690 695 700 Gly Asp Leu Ile Pro Tyr Val Ile Cys Glu Gly Asp Glu His Ile Asp 705 710 715 720 Asp Lys Ala Tyr His Ile Asp Glu Val Arg Arg Ser Asp Gly Leu Ser 725 730 735 Val Asp Val Glu Trp Tyr Leu Ser Ser Gln Leu Tyr Pro Pro Val Met 740 745 750 Arg Leu Cys Glu His Ile Gln Gly Phe 755 760 29 782 PRT Autographa californica nucleopolyhedrovirus 29 Asn Ala Ala Cys Leu Asp Lys Phe Leu His Asn Val Asn Arg Val His 1 5 10 15 Met Gln Thr Pro Phe Val Glu Gly Ala Tyr Met Arg Phe Lys Lys Thr 20 25 30 Gln Arg Cys Gln Asn Asn Tyr Val Gly Gly Ser Thr Thr Arg Met
Phe 35 40 45Asn Leu Gln His Phe Asn Glu Asp Phe Glu Leu Val Asp Glu Met Thr 50 55 60Leu Thr Ser Gly Ile Met Pro Val Leu Ser Cys Tyr Asp Ile Glu Thr 65 70 7580 His Ser Asp Gly His Asn Met Ser Lys Ala Ser Val Asp Cys Ile Met 85 9095 Ser Ile Gly Phe Val Val Tyr Lys Asn Asp Glu Tyr Ala Lys Phe Cys 100105 110 Phe Met Tyr His Lys Leu Pro Thr Gln Ile Pro Glu Thr Tyr Asp Asp115 120 125 Asp Thr Tyr Val Val Met Phe Gln Asn Glu Ile Asp Met Ile ThrAla 130 135 140 Phe Phe Asp Met Ile Lys Ile Thr Asn Pro Asp Val Ile LeuAsp Phe 145 150 155 160 Asn Gly Asp Val Phe Asp Leu Pro Tyr Ile Leu GlyArg Leu Asn Lys 165 170 175 Thr Lys Met Leu Leu Lys Arg Tyr Asp Leu ProAla Ala Ala Pro Thr 180 185 190 Thr Lys Leu Phe Ile Asn Lys Leu Gly AsnLys Val Asp Thr Tyr Tyr 195 200 205 Phe Asn Tyr Tyr Ile His Ile Asp LeuTyr Lys Phe Phe Ser Ser Asp 210 215 220 Ser Asn Gln His Lys Val Glu AsnPhe Gln Leu Asn Thr Ile Ser Ser 225 230 235 240 Tyr Tyr Leu Gly Glu AsnLys Ile Asp Leu Pro Trp Thr Glu Met Val 245 250 255 Lys Met Tyr Asn ThrArg Arg Leu Asp Val Ile Ala Lys Tyr Asn Val 260 265 270 Gln Asp Cys MetLeu Pro Ile Lys Leu Phe Val Lys Leu Lys Met Ala 275 280 285 Asp Ser ValTyr Ser Gln Cys Ile Leu His Arg Leu Cys Thr Asp Asp 290 295 300 Val IleCys Asn Ile Ser His Leu Ile Ser Val Ala Cys Phe Tyr Ala 305 310 315 320Ala Ile Thr Asn Thr Arg Ile Asn Glu Ser Thr Gly Lys Glu Glu Pro 325 330335 Asp Pro Tyr Phe Phe Asn Lys Asn Asp Leu Ser Ile Ile Ser Gly Gln 340345 350 Phe Lys Ala Asp Lys Ala Ala Ala Gly Ile Ser Asn Leu Lys Arg Lys355 360 365 Leu Ile Pro Leu Lys Asn Ile Pro Lys Asp Ala Ile Asn Leu GlyPro 370 375 380 Ala Asn Gln Thr Val Lys Tyr Lys Gly Gly Lys Val Leu LysPro Arg 385 390 395 400 Ala Gly Ile Tyr Lys Asn Ala Phe Ser Leu Asp PheAsn Ser Leu Tyr 405 410 415 Leu Thr Ile Met Ile Ala Ile Cys Ala Cys LeuSer Asn Leu Ile Leu 420 425 430 Cys Glu Asp Gly Asn Val Tyr Leu Asn HisAsn Ser Arg Ala Ile Val 435 440 445 Val Lys Leu Leu Leu Lys Leu Leu SerGlu Arg Cys Lys Phe Lys Lys 450 455 460 Asn Arg Asp Asn Gln Ser Glu 
SerAla Phe Leu Tyr Asp Leu Tyr Asp 465 470 475 480 Gln Lys Gln Asn Ser ValLys Arg Thr Ala Asn Ser Ile Tyr Gly Tyr 485 490 495 Tyr Gly Ile Phe TyrLys Val Leu Ala Asn Tyr Ile Thr Arg Val Gly 500 505 510 Arg Asn Gln LeuArg Leu Ala Ile Ser Leu Ile Glu Gly Leu Ser Asn 515 520 525 Asp Pro GluIle Leu Glu Lys Phe Asn Leu Gly Ser Ile Thr Phe Lys 530 535 540 Val ValTyr Gly Asp Thr Asp Ser Thr Phe Val Leu Pro Thr Phe Asn 545 550 555 560Tyr Asn Glu Ile Ser Asn Glu Thr Asp Thr Leu Lys Gln Ile Cys Thr 565 570575 His Val Glu Thr Arg Val Asn Asn Ser Phe Thr Asp Gly Tyr Lys Met 580585 590 Ala Phe Glu Asn Leu Met Lys Val Leu Ile Leu Leu Lys Lys Lys Lys595 600 605 Tyr Cys Tyr Leu Asn Ser Glu Asn Lys Ile Val Tyr Lys Gly TrpLeu 610 615 620 Val Lys Lys Asp Met Pro Val Phe Met Arg Ile Ala Phe ArgThr Ala 625 630 635 640 Val Glu Gln Ile Leu Arg His Leu Asp Met Asp LysCys Leu Gln Ser 645 650 655 Leu Gln Thr Ser Phe Tyr Glu Tyr Tyr Asp GluPhe Ala Lys Ser Lys 660 665 670 Ser Leu Thr Asp Tyr Ser Phe Ser Met ThrTyr Asn Asp Asn Pro Gly 675 680 685 Lys Lys Arg Lys Ser Thr Asp Asp AsnGlu Gly Pro Ser Pro Lys Arg 690 695 700 Arg Val Ile Thr Val Ala Arg HisCys Arg Glu Ile Leu Val Asn Lys 705 710 715 720 Gly Thr Asp Phe Val ProGly Asn Gly Asp Arg Ile Pro Tyr Leu Leu 725 730 735 Ile Asp Ile Glu GlyLys Val Thr Glu Lys Ala Tyr Pro Leu Arg Leu 740 745 750 Phe Asp Pro ValLys Met Arg Ile Ser Trp Ile Lys His Met Gly Ile 755 760 765 Leu Cys ThrPhe Met Asn Glu Leu Leu Glu Ile Phe Gly Asp 770 775 780 30 797 PRTLymantria dispar multicapsid nuclear polyhedrosis 30 Asp Lys Asn Cys LeuAsp Gly Tyr Leu Ala Asp Val Asn Arg Val His 1 5 10 15 Met Gln Thr SerLeu Leu Glu Gly Gln Tyr Val Arg Phe Lys Asn Ala 20 25 30 His Ala Cys ArgAsp Tyr Arg Leu Ser His Thr Ala Lys Asp Val His 35 40 45 Glu Phe Glu SerMet Leu Glu Arg Val Gln Val Ser Ala Leu Ser His 50 55 60 Glu Ile Leu ProVal Val Ala Cys Tyr Asp Ile Glu Thr His Ser Asp 65 70 75 80 Gly Gln ArgPhe Ser Ala Pro Asp Ala Asp Phe Ile Ile Ser Ile Ala 85 90 95 Val Val ValArg Arg 
Asp Ala Ala Asp Thr Arg Ile Cys Leu Phe Tyr 100 105 110 Ser ProAsp Asp Pro Val Asp Leu Ser Ser Ser Ser Ser Ser Pro Pro 115 120 125 AlaAla Pro Asp Thr Ala Ala Val His Phe Arg Ala Glu Arg Asp Met 130 135 140Ile Ala Ala Phe Phe Gln Leu Leu Pro Leu Leu Asn Ala Asp Val Val 145 150155 160 Leu Asp Phe Asn Gly Asp Lys Phe Asp Leu Pro Phe Leu Thr Gly Arg165 170 175 Ala Asn Lys Leu Cys Gly Pro Ala Glu Ala Ala Arg Ala Thr LysIle 180 185 190 Ala Arg Tyr Asp Leu Ser Pro Val Asn Val Val Thr Gln GlnSer Tyr 195 200 205 Asp Lys Phe Ser Asn Lys Leu His Ser His Tyr Leu ThrTyr Tyr Ile 210 215 220 His Ile Asp Leu Tyr Gln Phe Leu Ser Thr Asp SerGlu His Asn Asp 225 230 235 240 Leu Glu Asn Phe Gln Leu Asn Thr Val AlaGlu His Tyr Leu Lys Lys 245 250 255 Ser Lys Val Asp Leu Pro Ile His AspMet Leu Gln Met Tyr Gly Glu 260 265 270 Lys Arg Leu Ser Arg Ile Val GluTyr Asn Val Gln Asp Cys Val Leu 275 280 285 Pro Val Glu Leu Phe Leu LysLeu Glu Ile Ala Asp Tyr Met Tyr Thr 290 295 300 Gln Cys Met Leu Leu TyrLeu Cys Thr Asp Asp Leu Leu Arg Asn Ile 305 310 315 320 Ser His Lys IleThr Val Ala Tyr Phe His Leu Ala Leu Thr Asn Thr 325 330 335 Val Ala ArgArg Pro Asp Pro Thr Pro Asp Pro Tyr Phe Phe Asn Lys 340 345 350 Tyr AspLeu Ser Val Thr Ser Gly Ala Ser Ala Pro Ser Thr Ser Arg 355 360 365 ProAla Asn Ala Ile Asp Leu Ser Gln Leu Lys Arg Thr Pro Val Asp 370 375 380Ala Ala Arg Ile Pro Pro Ser Ala Val Lys Leu Cys Ser Thr Arg Gln 385 390395 400 Ser Cys Thr Tyr Lys Gly Gly Lys Val Leu Ser Pro Lys Pro Gly Phe405 410 415 Asn Arg Trp Val Ala Thr Leu Asp Phe Asn Ala Leu Tyr Pro ThrIle 420 425 430 Met Met Trp Glu Gly Val Cys Met Ser Ser Asn Val Phe IleAla Ser 435 440 445 Asp Gly Asn Val Tyr Leu Asp Lys Asn Val Asn Ala ValAsn Pro Lys 450 455 460 Leu Leu Lys Thr Leu Ser Glu Met Arg Val Arg TyrLys Gly Leu Arg 465 470 475 480 Asp Gln Cys Glu Tyr Asn Ser Phe Tyr TyrLys Leu Tyr Asp Lys Ile 485 490 495 Gln Asn Ala Leu Lys Arg Ile Ala AsnSer Ile Tyr Gly Tyr Tyr Gly 500 505 510 Ile Phe Phe Lys Pro Leu Ala AsnTyr Ile Thr Lys Met 
Gly Arg Gly 515 520 525 Lys Leu Lys Glu Val Val Gly Lys Val Glu Ala Met Ser Asp Asp Pro 530 535 540 Arg Ile Leu Arg Glu Phe Gly Leu Ser Lys Ile Asn Phe Ser Val Ile 545 550 555 560 Tyr Gly Asp Thr Asp Ser Cys Phe Ile Arg Val Leu Phe Asp Glu Ala 565 570 575 Glu Trp Arg Arg Thr Ala Ala Arg Pro Arg Ser Ala Pro Ser Cys Arg 580 585 590 Thr Thr Cys Ala Lys Arg Ser Thr Thr Leu Trp Cys Gly Tyr Lys Met 595 600 605 Ser Leu Glu Asn Ile Met Leu Ser Leu Ile Leu Leu Lys Lys Lys Lys 610 615 620 Tyr Cys Tyr Leu Asn Asn Glu Gln Arg Thr Lys Tyr Lys Gly Trp Leu 625 630 635 640 Ile Lys Arg Asp Met Pro Leu Phe Met Arg Lys Ala Phe Arg Ala Thr 645 650 655 Val Asp Ser Phe Ser Ala Ala Thr Arg Arg Val Arg Ala Arg Pro Ala 660 665 670 Arg Arg Glu Met Leu Arg Tyr Tyr Arg Glu Phe Gly Ala Pro Arg Glu 675 680 685 Asn Leu Val Asp Tyr Cys Phe Ser Met Ser Tyr Asn Glu Thr Ser Thr 690 695 700 Thr Ala Lys Arg Arg Lys Glu Glu Asp Pro Ala Arg Lys Pro Val Ile 705 710 715 720 Thr Ile Ala Lys His Cys Arg Glu Leu Leu Ala Asn Pro Gly Val Asp 725 730 735 Phe Leu Pro Gly Asn Gly Asp Arg Ile Gln Tyr Val Leu Val Asp Val 740 745 750 Lys Glu Lys Ile Thr Gln Lys Ala Phe Pro Leu Lys Leu Phe Asp Pro 755 760 765 Asp Ser Pro Thr Leu Gln Ile Ser Trp Leu Lys His Met Asn Ile Leu 770 775 780 Cys Thr Phe Met Asn Glu Leu Ile Gln Val Phe Gly Asn 785 790 795 31 745 PRT Saccharomyces cerevisiae 31 Asn Lys Val Pro Ser Met Gly Asn Lys Lys Thr Glu Ser Gln Ile Ser 1 5 10 15 Met His Thr Pro His Ser Lys Phe Leu Tyr Lys Phe Ala Ser Asp Val 20 25 30 Ser Gly Lys Gln Lys Arg Lys Lys Ser Ser Val His Asp Ser Leu Thr 35 40 45 His Leu Thr Leu Glu Ile His Ala Asn Thr Arg Ser Asp Lys Ile Pro 50 55 60 Asp Pro Ala Ile Asp Glu Val Ser Met Ile Ile Trp Cys Leu Glu Glu 65 70 75 80 Glu Thr Phe Pro Leu Asp Leu Asp Ile Ala Tyr Glu Gly Ile Met Ile 85 90 95 Val His Lys Ala Ser Glu Asp Ser Thr Phe Pro Thr Lys Ile Gln His 100 105 110 Cys Ile Asn Glu Ile Pro Val Met Phe Tyr Glu Ser Glu Phe Glu Met 115 120 125 Phe Glu Ala Leu Thr Asp Leu Val Leu Leu Leu Asp Pro Asp Ile Leu 130 135 140
Ser Gly Phe Glu Ile His Asn Phe SerTrp Gly Tyr Ile Ile Glu Arg 145 150 155 160 Cys Gln Lys Ile His Gln PheAsp Ile Val Arg Glu Leu Ala Arg Val 165 170 175 Lys Cys Gln Ile Lys ThrLys Leu Ser Asp Thr Trp Gly Tyr Ala His 180 185 190 Ser Ser Gly Ile MetIle Thr Gly Arg His Met Ile Asn Ile Trp Arg 195 200 205 Ala Leu Arg SerAsp Val Asn Leu Thr Gln Tyr Thr Ile Glu Ser Ala 210 215 220 Ala Phe AsnIle Leu His Lys Arg Leu Pro His Phe Ser Phe Glu Ser 225 230 235 240 LeuThr Asn Met Trp Asn Ala Lys Lys Ser Thr Thr Glu Leu Lys Thr 245 250 255Val Leu Asn Tyr Trp Leu Ser Arg Ala Gln Ile Asn Ile Gln Leu Leu 260 265270 Arg Lys Gln Asp Tyr Ile Ala Arg Asn Ile Glu Gln Ala Arg Leu Ile 275280 285 Gly Ile Asp Phe His Ser Val Tyr Tyr Arg Gly Ser Gln Phe Lys Val290 295 300 Glu Ser Phe Leu Ile Arg Ile Cys Lys Ser Glu Ser Phe Ile LeuLeu 305 310 315 320 Ser Pro Gly Lys Lys Asp Val Arg Lys Gln Lys Ala LeuGlu Cys Val 325 330 335 Pro Leu Val Met Glu Pro Glu Ser Ala Phe Tyr LysSer Pro Leu Ile 340 345 350 Val Leu Asp Phe Gln Ser Leu Tyr Pro Ser IleMet Ile Gly Tyr Asn 355 360 365 Tyr Cys Tyr Ser Thr Met Ile Gly Arg ValArg Glu Ile Asn Leu Thr 370 375 380 Glu Asn Asn Leu Gly Val Ser Lys PheSer Leu Pro Arg Asn Ile Leu 385 390 395 400 Ala Leu Leu Lys Asn Asp ValThr Ile Ala Pro Asn Gly Val Val Tyr 405 410 415 Ala Lys Thr Ser Val ArgLys Ser Thr Leu Ser Lys Met Leu Thr Asp 420 425 430 Ile Leu Asp Val ArgVal Met Ile Lys Lys Thr Met Asn Glu Ile Gly 435 440 445 Asp Asp Asn ThrThr Leu Lys Arg Leu Leu Asn Asn Lys Gln Leu Ala 450 455 460 Leu Lys LeuLeu Ala Asn Val Thr Tyr Gly Tyr Thr Ser Ala Ser Phe 465 470 475 480 SerGly Arg Met Pro Cys Ser Asp Leu Ala Asp Ser Ile Val Gln Thr 485 490 495Gly Arg Glu Thr Leu Glu Lys Ala Ile Asp Ile Ile Glu Lys Asp Glu 500 505510 Thr Trp Asn Ala Lys Val Val Tyr Gly Asp Thr Asp Ser Leu Phe Val 515520 525 Tyr Leu Pro Gly Lys Thr Ala Ile Glu Ala Phe Ser Ile Gly His Ala530 535 540 Met Ala Glu Arg Val Thr Gln Asn Asn Pro Lys Pro Ile Phe LeuLys 545 550 555 560 Phe Glu Lys Val Tyr His Pro 
Ser Ile Leu Ile Ser Lys Lys Arg Tyr 565 570 575 Val Gly Phe Ser Tyr Glu Ser Pro Ser Gln Thr Leu Pro Ile Phe Asp 580 585 590 Ala Lys Gly Ile Glu Thr Val Arg Arg Asp Gly Ile Pro Ala Gln Gln 595 600 605 Lys Ile Ile Glu Lys Cys Ile Arg Leu Leu Phe Gln Thr Lys Asp Leu 610 615 620 Ser Lys Ile Lys Lys Tyr Leu Gln Asn Glu Phe Phe Lys Ile Gln Ile 625 630 635 640 Gly Lys Val Ser Ala Gln Asp Phe Cys Phe Ala Lys Glu Val Lys Leu 645 650 655 Gly Ala Tyr Lys Ser Glu Lys Thr Ala Pro Ala Gly Ala Val Val Val 660 665 670 Lys Arg Arg Ile Asn Glu Asp His Arg Ala Glu Pro Gln Tyr Lys Glu 675 680 685 Arg Ile Pro Tyr Leu Val Val Lys Gly Lys Gln Gly Gln Leu Leu Arg 690 695 700 Glu Arg Cys Val Ser Pro Glu Glu Phe Leu Glu Gly Glu Asn Leu Glu 705 710 715 720 Leu Asp Ser Glu Tyr Tyr Ile Asn Lys Ile Leu Ile Pro Pro Leu Asp 725 730 735 Arg Leu Phe Asn Leu Ile Gly Ile Asn 740 745 32 727 PRT Pyrococcus woesei 32 Phe Lys Ile Glu His Asp Arg Thr Phe Arg Pro Tyr Ile Tyr Ala Leu 1 5 10 15 Leu Arg Asp Asp Ser Lys Ile Glu Glu Val Lys Lys Ile Thr Gly Glu 20 25 30 Arg His Gly Lys Ile Val Arg Ile Val Asp Val Glu Lys Val Glu Lys 35 40 45 Lys Phe Leu Gly Lys Pro Ile Thr Val Trp Lys Leu Tyr Leu Glu His 50 55 60 Pro Gln Asp Val Pro Thr Ile Arg Glu Lys Val Arg Glu His Pro Ala 65 70 75 80 Val Val Asp Ile Phe Glu Tyr Asp Ile Pro Phe Ala Lys Arg Tyr Leu 85 90 95 Ile Asp Lys Gly Leu Ile Pro Met Glu Gly Glu Glu Glu Leu Lys Ile 100 105 110 Leu Ala Phe Asp Ile Glu Thr Leu Tyr His Glu Gly Glu Glu Phe Gly 115 120 125 Lys Gly Pro Ile Ile Met Ile Ser Tyr Ala Asp Glu Asn Glu Ala Lys 130 135 140 Val Ile Thr Trp Lys Asn Ile Asp Leu Pro Tyr Val Glu Val Val Ser 145 150 155 160 Ser Glu Arg Glu Met Ile Lys Arg Phe Leu Arg Ile Ile Arg Glu Lys 165 170 175 Asp Pro Asp Ile Ile Val Thr Tyr Asn Gly Asp Ser Phe Asp Phe Pro 180 185 190 Tyr Leu Ala Lys Arg Ala Glu Lys Leu Gly Ile Lys Leu Thr Ile Gly 195 200 205 Arg Asp Gly Ser Glu Pro Lys Met Gln Arg Ile Gly Asp Met Thr Ala 210 215 220 Val Glu Val Lys Gly Arg Ile His Phe Asp Leu Tyr His Val Ile Thr 225 230 235 240
Arg Thr Ile Asn LeuPro Thr Tyr Thr Leu Glu Ala Val Tyr Glu Ala 245 250 255 Ile Phe Gly LysPro Lys Glu Lys Val Tyr Ala Asp Glu Ile Ala Lys 260 265 270 Ala Trp GluSer Gly Glu Asn Leu Glu Arg Val Ala Lys Tyr Ser Met 275 280 285 Glu AspAla Lys Ala Thr Tyr Glu Leu Gly Lys Glu Phe Leu Pro Met 290 295 300 GluIle Gln Leu Ser Arg Leu Val Gly Gln Pro Leu Trp Asp Val Ser 305 310 315320 Arg Ser Ser Thr Gly Asn Leu Val Glu Trp Phe Leu Leu Arg Lys Ala 325330 335 Tyr Glu Arg Asn Glu Val Ala Pro Asn Lys Pro Ser Glu Glu Glu Tyr340 345 350 Gln Arg Arg Leu Arg Glu Ser Tyr Thr Gly Gly Phe Val Lys GluPro 355 360 365 Glu Lys Gly Leu Trp Glu Asn Ile Val Tyr Leu Asp Phe ArgAla Leu 370 375 380 Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser Pro AspThr Leu Asn 385 390 395 400 Leu Glu Gly Cys Lys Asn Tyr Asp Ile Ala ProGln Val Gly His Lys 405 410 415 Phe Cys Lys Asp Ile Pro Gly Phe Ile ProSer Leu Leu Gly His Leu 420 425 430 Leu Glu Glu Arg Gln Lys Ile Lys ThrLys Met Lys Glu Thr Gln Asp 435 440 445 Pro Ile Glu Lys Ile Leu Leu AspTyr Arg Gln Lys Ala Ile Lys Leu 450 455 460 Leu Ala Asn Ser Phe Tyr GlyTyr Tyr Gly Tyr Ala Lys Ala Arg Trp 465 470 475 480 Tyr Cys Lys Glu CysAla Glu Ser Val Thr Ala Trp Gly Arg Lys Tyr 485 490 495 Ile Glu Leu ValTrp Lys Glu Leu Glu Glu Lys Phe Gly Phe Lys Val 500 505 510 Leu Tyr IleAsp Thr Asp Gly Leu Tyr Ala Thr Ile Pro Gly Gly Glu 515 520 525 Ser GluGlu Ile Lys Lys Lys Ala Leu Glu Phe Val Lys Tyr Ile Asn 530 535 540 SerLys Leu Pro Gly Leu Leu Glu Leu Glu Tyr Glu Gly Phe Tyr Lys 545 550 555560 Arg Gly Phe Phe Val Thr Lys Lys Arg Tyr Ala Val Ile Asp Glu Glu 565570 575 Gly Lys Val Ile Thr Arg Gly Leu Glu Ile Val Arg Arg Asp Trp Ser580 585 590 Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu Glu Thr Ile LeuLys 595 600 605 His Gly Asp Val Glu Glu Ala Val Arg Ile Val Lys Glu ValIle Gln 610 615 620 Lys Leu Ala Asn Tyr Glu Ile Pro Pro Glu Lys Leu AlaIle Tyr Glu 625 630 635 640 Gln Ile Thr Arg Pro Leu His Glu Tyr Lys AlaIle Gly Pro His Val 645 650 655 Ala Val Ala Lys Lys Leu Ala Ala 
Lys GlyVal Lys Ile Lys Pro Gly 660 665 670 Met Val Ile Gly Tyr Ile Val Leu ArgGly Asp Gly Pro Ile Ser Asn 675 680 685 Arg Ala Ile Leu Ala Glu Glu TyrAsp Pro Lys Lys His Lys Tyr Asp 690 695 700 Ala Glu Tyr Tyr Ile Glu AsnGln Val Leu Pro Ala Val Leu Arg Ile 705 710 715 720 Leu Glu Gly Phe GlyTyr Arg 725 33 702 PRT Sulfolobus solfataricus 33 Phe Asn Asn Tyr MetTyr Asp Ile Gly Leu Ile Pro Gly Met Pro Tyr 1 5 10 15 Val Val Lys AsnGly Lys Leu Glu Ser Val Tyr Leu Ser Leu Asp Glu 20 25 30 Lys Asp Val GluGlu Ile Lys Lys Ala Phe Ala Asp Ser Asp Glu Met 35 40 45 Thr Arg Gln MetAla Val Asp Trp Leu Pro Ile Phe Glu Thr Glu Ile 50 55 60 Pro Lys Ile LysArg Val Ala Ile Asp Ile Glu Val Tyr Thr Pro Val 65 70 75 80 Lys Gly ArgIle Pro Asp Ser Gln Lys Ala Glu Phe Pro Ile Ile Ser 85 90 95 Ile Ala LeuAla Gly Ser Asp Gly Leu Lys Lys Val Leu Val Leu Asn 100 105 110 Arg AsnAsp Val Asn Glu Gly Ser Val Lys Leu Asp Gly Ile Ser Val 115 120 125 GluArg Phe Asn Thr Glu Tyr Glu Leu Leu Gly Arg Phe Phe Asp Ile 130 135 140Leu Leu Glu Tyr Pro Ile Val Leu Thr Phe Asn Gly Asp Asp Phe Asp 145 150155 160 Leu Pro Tyr Ile Tyr Phe Arg Ala Leu Lys Leu Gly Tyr Phe Pro Glu165 170 175 Glu Ile Pro Ile Asp Val Ala Gly Lys Asp Glu Ala Lys Tyr LeuAla 180 185 190 Gly Leu His Ile Asp Leu Tyr Lys Phe Phe Phe Asn Lys AlaVal Arg 195 200 205 Asn Tyr Ala Phe Glu Gly Lys Tyr Asn Glu Tyr Asn LeuAsp Ala Val 210 215 220 Ala Lys Ala Leu Leu Gly Thr Ser Lys Val Lys ValAsp Thr Leu Ile 225 230 235 240 Ser Phe Leu Asp Val Glu Lys Leu Ile GluTyr Asn Phe Arg Asp Ala 245 250 255 Glu Ile Thr Leu Gln Leu Thr Thr PheAsn Asn Asp Leu Thr Met Lys 260 265 270 Leu Ile Val Leu Phe Ser Arg IleSer Arg Leu Gly Ile Glu Glu Leu 275 280 285 Thr Arg Thr Glu Ile Ser ThrTrp Val Lys Asn Leu Tyr Tyr Trp Glu 290 295 300 His Arg Lys Arg Asn TrpLeu Ile Pro Leu Lys Glu Glu Ile Leu Ala 305 310 315 320 Lys Ser Ser AsnIle Arg Thr Ser Ala Leu Ile Lys Gly Lys Gly Tyr 325 330 335 Lys Gly AlaVal Val Ile Asp Pro Pro Ala Gly Ile Phe Phe Asn Ile 340 345 350 Thr ValLeu 
Asp Phe Ala Ser Leu Tyr Pro Ser Ile Ile Arg Thr Trp 355 360 365 AsnLeu Ser Tyr Glu Thr Val Asp Ile Gln Gln Cys Lys Lys Pro Tyr 370 375 380Glu Val Lys Asp Glu Thr Gly Glu Val Leu His Ile Val Cys Met Asp 385 390395 400 Arg Pro Gly Ile Thr Ala Val Ile Thr Gly Leu Leu Arg Asp Phe Arg405 410 415 Val Lys Ile Tyr Lys Lys Lys Ala Lys Asn Pro Asn Asn Ser GluGlu 420 425 430 Gln Lys Leu Leu Tyr Asp Val Val Gln Arg Ala Met Lys ValPhe Ile 435 440 445 Asn Ala Thr Tyr Gly Val Phe Gly Ala Glu Thr Phe ProLeu Tyr Ala 450 455 460 Pro Arg Val Ala Glu Ser Val Thr Ala Leu Gly ArgTyr Val Ile Thr 465 470 475 480 Ser Thr Val Lys Lys Ala Arg Glu Glu GlyLeu Thr Val Leu Tyr Gly 485 490 495 Asp Thr Asp Ser Leu Phe Leu Leu AsnPro Pro Lys Asn Ser Leu Glu 500 505 510 Asn Ile Ile Lys Trp Val Lys ThrThr Phe Asn Leu Asp Leu Glu Val 515 520 525 Asp Lys Thr Tyr Lys Phe ValAla Phe Ser Gly Leu Lys Lys Asn Tyr 530 535 540 Phe Gly Val Tyr Gln AspGly Lys Val Asp Ile Lys Gly Met Leu Val 545 550 555 560 Lys Lys Arg AsnThr Pro Glu Phe Val Lys Lys Val Phe Asn Glu Val 565 570 575 Lys Glu LeuMet Ile Ser Ile Asn Ser Pro Asn Asp Val Lys Glu Ile 580 585 590 Lys ArgLys Ile Val Asp Val Val Lys Gly Ser Tyr Glu Lys Leu Lys 595 600 605 AsnLys Gly Tyr Asn Leu Asp Glu Leu Ala Phe Lys Val Met Leu Ser 610 615 620Lys Pro Leu Asp Ala Tyr Lys Lys Asn Thr Pro Gln His Val Lys Ala 625 630635 640 Ala Leu Gln Leu Arg Pro Phe Gly Val Asn Val Leu Pro Arg Asp Ile645 650 655 Ile Tyr Tyr Val Lys Val Arg Ser Lys Asp Gly Val Lys Pro ValGln 660 665 670 Leu Ala Lys Val Thr Glu Ile Asp Ala Glu Lys Tyr Leu GluAla Leu 675 680 685 Arg Ser Thr Phe Glu Gln Ile Leu Arg Ala Phe Gly ValSer 690 695 700 34 719 PRT Escherichia coli 34 Ala Gln His Ile Leu GlnGly Glu Gln Gly Phe Arg Leu Thr Pro Leu 1 5 10 15 Ala Leu Lys Asp PheHis Arg Gln Pro Val Tyr Gly Leu Tyr Cys Arg 20 25 30 Ala His Arg Gln LeuMet Asn Tyr Glu Lys Arg Leu Arg Glu Gly Gly 35 40 45 Val Thr Val Tyr GluAla Asp Val Arg Pro Pro Glu Arg Tyr Leu Met 50 55 60 Glu Arg Phe Ile ThrSer Pro Val Trp 
Val Glu Gly Asp Met His Asn 65 70 75 80 Gly Thr Ile ValAsn Ala Arg Leu Lys Pro His Pro Asp Tyr Arg Pro 85 90 95 Pro Leu Lys TrpVal Ser Ile Asp Ile Glu Thr Thr Arg His Gly Glu 100 105 110 Leu Tyr CysIle Gly Leu Glu Gly Cys Gly Gln Arg Ile Val Tyr Met 115 120 125 Leu GlyPro Glu Asn Gly Asp Ala Ser Ser Leu Asp Phe Glu Leu Glu 130 135 140 TyrVal Ala Ser Arg Pro Gln Leu Leu Glu Lys Leu Asn Ala Trp Phe 145 150 155160 Ala Asn Tyr Asp Pro Asp Val Ile Ile Gly Trp Asn Val Val Gln Phe 165170 175 Asp Leu Arg Met Leu Gln Lys His Ala Glu Arg Tyr Arg Leu Pro Leu180 185 190 Arg Leu Gly Arg Asp Asn Ser Glu Leu Glu Trp Arg Glu His GlyPhe 195 200 205 Lys Asn Gly Val Phe Phe Ala Gln Ala Lys Gly Arg Leu IleIle Asp 210 215 220 Gly Ile Glu Ala Leu Lys Ser Ala Phe Trp Asn Phe SerSer Phe Ser 225 230 235 240 Leu Glu Thr Val Ala Gln Glu Leu Leu Gly GluGly Lys Ser Ile Asp 245 250 255 Asn Pro Trp Asp Arg Met Asp Glu Ile AspArg Arg Phe Ala Glu Asp 260 265 270 Lys Pro Ala Leu Ala Thr Tyr Asn LeuLys Asp Cys Glu Leu Val Thr 275 280 285 Gln Ile Phe His Lys Thr Glu IleMet Pro Phe Leu Leu Glu Arg Ala 290 295 300 Thr Val Asn Gly Leu Pro ValAsp Arg His Gly Gly Ser Val Ala Ala 305 310 315 320 Phe Gly His Leu TyrPhe Pro Arg Met His Arg Ala Gly Tyr Val Ala 325 330 335 Pro Asn Leu GlyGlu Val Pro Pro His Ala Ser Pro Gly Gly Tyr Val 340 345 350 Met Asp SerArg Pro Gly Leu Tyr Asp Ser Val Leu Val Leu Asp Tyr 355 360 365 Lys SerLeu Tyr Pro Ser Ile Ile Arg Thr Phe Leu Ile Asp Pro Val 370 375 380 GlyLeu Val Glu Gly Met Ala Gln Pro Asp Pro Glu His Ser Thr Glu 385 390 395400 Gly Phe Leu Asp Ala Trp Phe Ser Arg Glu Lys His Cys Leu Pro Glu 405410 415 Ile Val Thr Asn Ile Trp His Gly Arg Asp Glu Ala Lys Arg Gln Gly420 425 430 Asn Lys Pro Leu Ser Gln Ala Leu Lys Ile Ile Met Asn Ala PheTyr 435 440 445 Gly Val Leu Gly Thr Thr Ala Cys Arg Phe Phe Asp Pro ArgLeu Ala 450 455 460 Ser Ser Ile Thr Met Arg Gly His Gln Ile Met Arg GlnThr Lys Ala 465 470 475 480 Leu Ile Glu Ala Gln Gly Tyr Asp Val Ile TyrGly Asp Thr Asp Ser 485 
490 495 Thr Phe Val Trp Leu Lys Gly Ala His Ser Glu Glu Glu Ala Ala Lys 500 505 510 Ile Gly Arg Ala Leu Val Gln His Val Asn Ala Trp Trp Ala Glu Thr 515 520 525 Leu Gln Lys Gln Arg Leu Thr Ser Ala Leu Glu Leu Glu Tyr Glu Thr 530 535 540 His Phe Cys Arg Phe Leu Met Pro Thr Ile Arg Gly Ala Asp Thr Gly 545 550 555 560 Ser Lys Lys Arg Tyr Ala Gly Leu Ile Gln Glu Gly Asp Lys Gln Arg 565 570 575 Met Val Phe Lys Gly Leu Glu Thr Val Arg Thr Asp Trp Thr Pro Leu 580 585 590 Ala Gln Gln Phe Gln Gln Glu Leu Tyr Leu Arg Ile Phe Arg Asn Glu 595 600 605 Pro Tyr Gln Glu Tyr Val Arg Glu Thr Ile Asp Lys Leu Met Ala Gly 610 615 620 Glu Leu Asp Ala Arg Leu Val Tyr Arg Lys Arg Leu Arg Arg Pro Leu 625 630 635 640 Ser Glu Tyr Gln Arg Asn Val Pro Pro His Val Arg Ala Ala Arg Leu 645 650 655 Ala Asp Glu Glu Asn Gln Lys Arg Gly Arg Pro Leu Gln Tyr Gln Asn 660 665 670 Arg Gly Thr Ile Lys Tyr Val Trp Thr Thr Asn Gly Pro Glu Pro Leu 675 680 685 Asp Tyr Gln Arg Ser Pro Leu Asp Tyr Glu His Tyr Leu Thr Arg Gln 690 695 700 Leu Gln Pro Val Ala Glu Gly Ile Leu Pro Phe Ile Glu Asp Asn 705 710 715 35 773 PRT Desulfurococcus strain Tok 35 Met Ile Leu Asp Ala Asp Tyr Ile Thr Glu Asp Gly Lys Pro Val Ile 1 5 10 15 Arg Val Phe Lys Lys Glu Lys Gly Glu Phe Lys Ile Asp Tyr Asp Arg 20 25 30 Asp Phe Glu Pro Tyr Ile Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Ile Lys Lys Ile Thr Ala Glu Arg His Gly Thr Thr Val Arg 50 55 60 Val Thr Arg Ala Glu Arg Val Lys Lys Lys Phe Leu Gly Arg Pro Val 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Thr His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Lys Ile Arg Glu His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Arg Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Arg Met Leu Ala Phe Asp Ile Glu Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Glu Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Glu Glu Gly Ala Arg Val Ile Thr Trp Lys Asn Ile 165 170 175 Asp Leu Pro Tyr Val Glu Ser Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe
Leu Lys Val Ile Gln Glu Lys Asp Pro AspVal Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr LeuLys Lys Arg Ser Glu 210 215 220 Met Leu Gly Val Lys Phe Ile Leu Gly ArgAsp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg PheAla Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro ValIle Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Thr ValTyr Glu Pro Val Phe Gly Gln Pro Lys Glu 275 280 285 Lys Val Tyr Ala GluGlu Ile Ala Arg Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg ValAla Arg Tyr Ser Met Glu Asp Ala Lys Ala Thr Tyr 305 310 315 320 Glu LeuGly Lys Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 ValGly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Glu Arg Asn Asp Val Ala 355 360365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Thr Glu Ser Tyr 370375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Lys Gly Leu Trp Glu Asn Ile385 390 395 400 Val Tyr Leu Asp Tyr Lys Ser Leu Tyr Pro Ser Ile Ile IleThr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys ArgGlu Tyr Asp 420 425 430 Val Ala Pro Gln Val Gly His Arg Phe Cys Lys AspPhe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu GluArg Gln Lys Val Lys 450 455 460 Lys Lys Met Lys Ala Thr Val Asp Pro IleGlu Arg Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Ala Ile Lys IleLeu Ala Asn Ser Tyr Tyr Gly Tyr 485 490 495 Tyr Ala Tyr Ala Asn Ala ArgTrp Tyr Cys Arg Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly ArgGln Tyr Ile Glu Thr Thr Met Arg Glu Ile 515 520 525 Glu Glu Lys Phe GlyPhe Lys Val Leu Tyr Ala Asp Thr Asp Gly Phe 530 535 540 Phe Ala Thr IlePro Gly Ala Asp Ala Glu Thr Val Lys Asn Lys Ala 545 550 555 560 Lys GluPhe Leu Asn Tyr Ile Asn Pro Arg Leu Pro Gly Leu Leu Glu 565 570 575 LeuGlu Tyr Glu Gly Phe Tyr Arg Arg Gly Phe Phe Val Thr Lys Lys 580 585 590Lys Tyr Ala Val Ile Asp Glu Glu Asp Lys Ile Thr Thr Arg Gly Leu 595 600605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile 
Ala Lys Glu Thr Gln Ala 610615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Arg His GluVal Pro 645 650 655 Pro Glu Lys Leu Val Ile Tyr Glu Gln Ile Thr Arg AspLeu Arg Ser 660 665 670 Tyr Arg Ala Thr Gly Pro His Val Ala Val Ala LysArg Leu Ala Ala 675 680 685 Arg Gly Ile Lys Ile Arg Pro Gly Thr Val IleSer Tyr Ile Val Leu 690 695 700 Lys Gly Pro Gly Arg Val Gly Asp Arg AlaIle Pro Phe Asp Glu Phe 705 710 715 720 Asp Pro Ala Lys His Arg Tyr AspAla Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu ArgIle Leu Arg Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr GlnLys Thr Lys Gln Ala Gly Leu Gly Ala Trp 755 760 765 Leu Lys Pro Lys Thr770 36 871 PRT Bacteriophage RM378 36 Met Lys Ile Thr Leu Ser Ala SerVal Tyr Pro Arg Ser Met Lys Ile 1 5 10 15 Tyr Gly Val Glu Leu Ile GluGly Lys Lys His Leu Phe Gln Ser Pro 20 25 30 Val Pro Pro His Leu Lys ArgIle Ala Gln Gln Asn Arg Gly Lys Ile 35 40 45 Glu Ala Glu Ala Ile Ser TyrTyr Ile Arg Glu Gln Lys Ser His Ile 50 55 60 Thr Pro Glu Ala Leu Ser GlnCys Val Phe Ile Asp Ile Glu Thr Ile 65 70 75 80 Ser Pro Lys Lys Ser PhePro Asp Pro Trp Arg Asp Pro Val Tyr Ser 85 90 95 Ile Ser Ile Lys Pro TyrGly Lys Pro Val Val Val Val Leu Leu Leu 100 105 110 Ile Thr Asn Pro GluAla His Ile Asp Asn Phe Asn Lys Phe Thr Thr 115 120 125 Ser Val Gly AspAsn Thr Phe Glu Ile His Tyr Arg Thr Phe Leu Ser 130 135 140 Glu Lys ArgLeu Leu Glu Tyr Phe Trp Asn Val Leu Lys Pro Lys Phe 145 150 155 160 ThrPhe Met Leu Ala Trp Asn Gly Tyr Gln Phe Asp Tyr Pro Tyr Leu 165 170 175Leu Ile Arg Ser His Ile His Glu Val Asn Val Ile Ser Asp Lys Leu 180 185190 Leu Pro Asp Trp Lys Leu Val Arg Lys Ile Ser Asp Arg Asn Leu Pro 195200 205 Phe Tyr Phe Asn Pro Arg Thr Pro Val Glu Phe Val Phe Phe Asp Tyr210 215 220 Met Arg Leu Tyr Arg Ser Phe Val Ala Tyr Lys Glu Leu Glu SerTyr 225 230 235 240 Arg Leu Asp Tyr Ile Ala Arg Glu Glu Ile Gly Glu GlyLys Val Asp 245 250 255 Phe Asp Val Arg Phe Tyr His Glu 
Ile Pro Val TyrPro Asp Lys Lys 260 265 270 Leu Val Glu Tyr Asn Ala Val Asp Ala Ile LeuMet Glu Glu Ile Glu 275 280 285 Asn Lys Asn His Ile Leu Pro Thr Leu PheGlu Ile Ala Arg Leu Ser 290 295 300 Asn Leu Thr Pro Ala Leu Ala Leu AsnAla Ser Asn Ile Leu Ile Gly 305 310 315 320 Asn Val Thr Gly Lys Leu GlyVal Lys Phe Val Asp Tyr Ile Lys Lys 325 330 335 Ile Asp Thr Ile Asn ThrMet Phe Lys Lys Ile Pro Glu Met Asn Ile 340 345 350 Asn Lys Tyr Arg TyrArg Gly Ala Tyr Ile Glu Leu Thr Asn Pro Asp 355 360 365 Ile Tyr Phe AsnVal Phe Asp Leu Asp Phe Thr Ser Leu Tyr Pro Ser 370 375 380 Val Ile SerLys Phe Asn Ile Asp Pro Ala Thr Phe Val Thr Glu Phe 385 390 395 400 TyrGly Cys Met Arg Val Glu Asn Lys Val Ile Pro Val Asp Gln Glu 405 410 415Glu Pro Glu Phe Gly Phe Pro Leu Tyr Ile Phe Asp Ser Gly Met Asn 420 425430 Pro Ser Tyr Arg Ser Glu Pro Leu Phe Val Ile Asn Ser Phe Glu Glu 435440 445 Leu Arg Gln Phe Leu Lys Ser Arg Asn Ile Ile Met Val Pro Asn Pro450 455 460 Ser Gly Ile Cys Trp Phe Tyr Arg Lys Glu Pro Val Gly Val LeuPro 465 470 475 480 Ser Ile Ile Arg Glu Ile Phe Thr Arg Arg Lys Glu GluArg Lys Leu 485 490 495 Phe Lys Glu Thr Gly Asn Met Glu His His Phe ArgGln Trp Ala Leu 500 505 510 Lys Ile Met Met Asn Ser Met Tyr Gly Ile PheGly Asn Arg Ser Val 515 520 525 Tyr Met Gly Cys Leu Pro Ile Ala Glu SerVal Thr Ala Ala Gly Arg 530 535 540 Met Ser Ile Arg Ser Val Ile Ser GlnIle Arg Asp Arg Phe Ile Tyr 545 550 555 560 Ser His Thr Asp Ser Ile PheVal Lys Ala Phe Thr Asp Asp Pro Val 565 570 575 Ala Glu Ala Gly Glu LeuGln Glu His Leu Asn Ser Phe Ile Asn Asp 580 585 590 Tyr Met Glu Asn AsnPhe Asn Ala Arg Glu Asp Phe Lys Leu Glu Leu 595 600 605 Lys Gln Glu PheVal Phe Lys Ser Ile Leu Ile Lys Glu Ile Asn Arg 610 615 620 Tyr Phe AlaVal Thr Val Asp Gly Lys Glu Glu Met Lys Gly Ile Glu 625 630 635 640 ValIle Asn Ser Ser Val Pro Glu Ile Val Lys Lys Tyr Phe Arg Gly 645 650 655Tyr Leu Lys Tyr Ile Ser Gln Pro Asp Ile Asp Val Ile Ser Ala Thr 660 665670 Ile Ala Phe Tyr Asn Asn Phe Val Ser Gln Lys Asn Phe Trp Ser Ile 
675680 685 Glu Asp Leu Tyr His Lys Met Lys Ile Ser Ser Ser Asp Ser Ala Glu690 695 700 Arg Tyr Val Glu Tyr Val Glu Glu Val Met Lys Met Lys Lys GluAsn 705 710 715 720 Val Pro Ile Ser Glu Ile Phe Ile Lys Met Tyr Asp HisThr Leu Pro 725 730 735 Ile His Tyr Lys Gly Ala Leu Phe Ala Ser Ile IleGly Cys Lys Pro 740 745 750 Pro Gln Met Gly Asp Lys Ile Tyr Trp Phe TyrCys Thr Met Leu Asp 755 760 765 Pro Ser Arg Thr Asn Leu Pro Leu Ser LeuGlu Glu Val Asn Pro Glu 770 775 780 His Gly Ser Gly Val Trp Asp Ile LeuLys Ala Gly Lys Lys Thr His 785 790 795 800 Ile Asn Arg Leu Arg Asn IleHis Ala Leu Ser Ile Arg Glu Asp Asp 805 810 815 Glu Glu Gly Leu Glu IleVal Lys Lys Tyr Ile Asp Arg Asp Lys Tyr 820 825 830 Cys Gln Ile Ile SerGlu Lys Thr Ile Asp Leu Leu Lys Ser Leu Gly 835 840 845 Tyr Val Glu AsnThr Thr Lys Ile Lys Thr Val Glu Asp Leu Ile Arg 850 855 860 Phe Leu ValGlu Ser Glu Asn 865 870 37 898 PRT Bacteriophage RB69 37 Met Lys Glu PheTyr Leu Thr Val Glu Gln Ile Gly Asp Ser Ile Phe 1 5 10 15 Glu Arg TyrIle Asp Ser Asn Gly Arg Glu Arg Thr Arg Glu Val Glu 20 25 30 Tyr Lys ProSer Leu Phe Ala His Cys Pro Glu Ser Gln Ala Thr Lys 35 40 45 Tyr Phe AspIle Tyr Gly Lys Pro Cys Thr Arg Lys Leu Phe Ala Asn 50 55 60 Met Arg AspAla Ser Gln Trp Ile Lys Arg Met Glu Asp Ile Gly Leu 65 70 75 80 Glu AlaLeu Gly Met Asp Asp Phe Lys Leu Ala Tyr Leu Ser Asp Thr 85 90 95 Tyr AsnTyr Glu Ile Lys Tyr Asp His Thr Lys Ile Arg Val Ala Asn 100 105 110 PheAsp Ile Glu Val Thr Ser Pro Asp Gly Phe Pro Glu Pro Ser Gln 115 120 125Ala Lys His Pro Ile Asp Ala Ile Thr His Tyr Asp Ser Ile Asp Asp 130 135140 Arg Phe Tyr Val Phe Asp Leu Leu Asn Ser Pro Tyr Gly Asn Val Glu 145150 155 160 Glu Trp Ser Ile Glu Ile Ala Ala Lys Leu Gln Glu Gln Gly GlyAsp 165 170 175 Glu Val Pro Ser Glu Ile Ile Asp Lys Ile Ile Tyr Met ProPhe Asp 180 185 190 Asn Glu Lys Glu Leu Leu Met Glu Tyr Leu Asn Phe TrpGln Gln Lys 195 200 205 Thr Pro Val Ile Leu Thr Gly Trp Asn Val Glu SerPhe Asp Ile Pro 210 215 220 Tyr Val Tyr Asn Arg Ile Lys Asn Ile Phe GlyGlu 
Ser Thr Ala Lys 225 230 235 240 Arg Leu Ser Pro His Arg Lys Thr ArgVal Lys Val Ile Glu Asn Met 245 250 255 Tyr Gly Ser Arg Glu Ile Ile ThrLeu Phe Gly Ile Ser Val Leu Asp 260 265 270 Tyr Ile Asp Leu Tyr Lys LysPhe Ser Phe Thr Asn Gln Pro Ser Tyr 275 280 285 Ser Leu Asp Tyr Ile SerGlu Phe Glu Leu Asn Val Gly Lys Leu Lys 290 295 300 Tyr Asp Gly Pro IleSer Lys Leu Arg Glu Ser Asn His Gln Arg Tyr 305 310 315 320 Ile Ser TyrAsn Ile Ile Asp Val Tyr Arg Val Leu Gln Ile Asp Ala 325 330 335 Lys ArgGln Phe Ile Asn Leu Ser Leu Asp Met Gly Tyr Tyr Ala Lys 340 345 350 IleGln Ile Gln Ser Val Phe Ser Pro Ile Lys Thr Trp Asp Ala Ile 355 360 365Ile Phe Asn Ser Leu Lys Glu Gln Asn Lys Val Ile Pro Gln Gly Arg 370 375380 Ser His Pro Val Gln Pro Tyr Pro Gly Ala Phe Val Lys Glu Pro Ile 385390 395 400 Pro Asn Arg Tyr Lys Tyr Val Met Ser Phe Asp Leu Thr Ser LeuTyr 405 410 415 Pro Ser Ile Ile Arg Gln Val Asn Ile Ser Pro Glu Thr IleAla Gly 420 425 430 Thr Phe Lys Val Ala Pro Leu His Asp Tyr Ile Asn AlaVal Ala Glu 435 440 445 Arg Pro Ser Asp Val Tyr Ser Cys Ser Pro Asn GlyMet Met Tyr Tyr 450 455 460 Lys Asp Arg Asp Gly Val Val Pro Thr Glu IleThr Lys Val Phe Asn 465 470 475 480 Gln Arg Lys Glu His Lys Gly Tyr MetLeu Ala Ala Gln Arg Asn Gly 485 490 495 Glu Ile Ile Lys Glu Ala Leu HisAsn Pro Asn Leu Ser Val Asp Glu 500 505 510 Pro Leu Asp Val Asp Tyr ArgPhe Asp Phe Ser Asp Glu Ile Lys Glu 515 520 525 Lys Ile Lys Lys Leu SerAla Lys Ser Leu Asn Glu Met Leu Phe Arg 530 535 540 Ala Gln Arg Thr GluVal Ala Gly Met Thr Ala Gln Ile Asn Arg Lys 545 550 555 560 Leu Leu IleAsn Ser Leu Tyr Gly Ala Leu Gly Asn Val Trp Phe Arg 565 570 575 Tyr TyrAsp Leu Arg Asn Ala Thr Ala Ile Thr Thr Phe Gly Gln Met 580 585 590 AlaLeu Gln Trp Ile Glu Arg Lys Val Asn Glu Tyr Leu Asn Glu Val 595 600 605Cys Gly Thr Glu Gly Glu Ala Phe Val Leu Tyr Gly Asp Thr Asp Ser 610 615620 Ile Tyr Val Ser Ala Asp Lys Ile Ile Asp Lys Val Gly Glu Ser Lys 625630 635 640 Phe Arg Asp Thr Asn His Trp Val Asp Phe Leu Asp Lys Phe AlaArg 645 650 655 
Glu Arg Met Glu Pro Ala Ile Asp Arg Gly Phe Arg Glu Met Cys Glu 660 665 670 Tyr Met Asn Asn Lys Gln His Leu Met Phe Met Asp Arg Glu Ala Ile 675 680 685 Ala Gly Pro Pro Leu Gly Ser Lys Gly Ile Gly Gly Phe Trp Thr Gly 690 695 700 Lys Lys Arg Tyr Ala Leu Asn Val Trp Asp Met Glu Gly Thr Arg Tyr 705 710 715 720 Ala Glu Pro Lys Leu Lys Ile Met Gly Leu Glu Thr Gln Lys Ser Ser 725 730 735 Thr Pro Lys Ala Val Gln Lys Ala Leu Lys Glu Cys Ile Arg Arg Met 740 745 750 Leu Gln Glu Gly Glu Glu Ser Leu Gln Glu Tyr Phe Lys Glu Phe Glu 755 760 765 Lys Glu Phe Arg Gln Leu Asn Tyr Ile Ser Ile Ala Ser Val Ser Ser 770 775 780 Ala Asn Asn Ile Ala Lys Tyr Asp Val Gly Gly Phe Pro Gly Pro Lys 785 790 795 800 Cys Pro Phe His Ile Arg Gly Ile Leu Thr Tyr Asn Arg Ala Ile Lys 805 810 815 Gly Asn Ile Asp Ala Pro Gln Val Val Glu Gly Glu Lys Val Tyr Val 820 825 830 Leu Pro Leu Arg Glu Gly Asn Pro Phe Gly Asp Lys Cys Ile Ala Trp 835 840 845 Pro Ser Gly Thr Glu Ile Thr Asp Leu Ile Lys Asp Asp Val Leu His 850 855 860 Trp Met Asp Tyr Thr Val Leu Leu Glu Lys Thr Phe Ile Lys Pro Leu 865 870 875 880 Glu Gly Phe Thr Ser Ala Ala Lys Leu Asp Tyr Glu Lys Lys Ala Ser 885 890 895 Leu Phe 38 394 PRT Autographa californica nucleopolyhedrovirus 38 Met Leu His Val Ser Arg Leu Leu Ala Asn Gly Gly Val Lys Asn Leu 1 5 10 15 Cys Asp Lys Phe Lys Val Lys Ile Lys Asn Tyr Thr Glu His Asp Leu 20 25 30 Met Val Leu Asn Tyr Glu Ser Phe Glu Arg Asp Arg Asp His Pro Val 35 40 45 Val Val Glu Cys Arg Gly Leu Ile Leu Asn Ser Arg Thr Tyr Ala Val 50 55 60 Val Ser Arg Ser Phe Asp Arg Phe Phe Asn Phe Gln Glu Leu Leu Gln 65 70 75 80 Asn Ile Gly Gly Glu Asp Ala His His Lys Leu Phe Gln Ser Lys Glu 85 90 95 Asn Phe Lys Phe Tyr Glu Lys Ile Asp Gly Ser Leu Ile Lys Ile Tyr 100 105 110 Lys Tyr Asn Gly Glu Trp His Ala Ser Thr Arg Gly Ser Ala Phe Ala 115 120 125 Glu Asn Leu Cys Val Ser Asp Val Thr Phe Lys Arg Leu Val Leu Gln 130 135 140 Ala Leu Gln Leu Asp Glu Ala His Asn Gln Phe Gln Ala Leu Cys Asn 145 150 155 160 Glu Tyr Leu Asp Cys Ala Ser Thr His Met Phe Glu Leu Thr Ser
Lys 165 170 175 His Asn Arg IleVal Thr Val Tyr Asp Glu Gln Pro Thr Leu Trp Tyr 180 185 190 Leu Ala SerArg Asn Asn Glu Thr Gly Asp Tyr Phe Tyr Cys Ser Asn 195 200 205 Leu ProPhe Cys Lys Tyr Pro Lys Cys Tyr Glu Phe Thr Ser Val Gln 210 215 220 GluCys Val Glu His Ala Ala Gln Leu Lys Asn Leu Glu Glu Gly Phe 225 230 235240 Val Val Tyr Asp Lys Asn Asn Ala Pro Leu Cys Lys Ile Lys Ser Asp 245250 255 Val Tyr Leu Asn Met His Lys Asn Gln Ser Arg Ala Glu Asn Pro Thr260 265 270 Lys Leu Ala Gln Leu Val Ile Asn Gly Glu His Asp Asp Phe LeuAla 275 280 285 Leu Phe Pro His Leu Lys Ser Val Ile Lys Pro Tyr Val AspAla Arg 290 295 300 Asn Thr Phe Thr Asn Glu Ser Thr Ile Asn Ile Met ValSer Gly Leu 305 310 315 320 Thr Leu Asn Gln Gln Arg Phe Asn Glu Leu ValGln Thr Leu Pro Trp 325 330 335 Lys Cys Leu Ala Tyr Arg Cys Arg Lys AlaGln Thr Ile Asp Val Glu 340 345 350 Ser Glu Phe Leu Lys Leu Thr Glu ProGlu Lys Ile Lys Met Ile Lys 355 360 365 Asn Ile Ile Lys Phe Val Ser ThrLys Gln Ala Leu Asn Asn Lys Leu 370 375 380 Ala Pro Thr Ile Lys Leu ProSer Ser Lys 385 390 39 374 PRT Bacteriophage T4 39 Met Gln Glu Leu PheAsn Asn Leu Met Glu Leu Cys Lys Asp Ser Gln 1 5 10 15 Arg Lys Phe PheTyr Ser Asp Asp Val Ser Ala Ser Gly Arg Thr Tyr 20 25 30 Arg Ile Phe SerTyr Asn Tyr Ala Ser Tyr Ser Asp Trp Leu Leu Pro 35 40 45 Asp Ala Leu GluCys Arg Gly Ile Met Phe Glu Met Asp Gly Glu Lys 50 55 60 Pro Val Arg IleAla Ser Arg Pro Met Glu Lys Phe Phe Asn Leu Asn 65 70 75 80 Glu Asn ProPhe Thr Met Asn Ile Asp Leu Asn Asp Val Asp Tyr Ile 85 90 95 Leu Thr LysGlu Asp Gly Ser Leu Val Ser Thr Tyr Leu Asp Gly Asp 100 105 110 Glu IleLeu Phe Lys Ser Lys Gly Ser Ile Lys Ser Glu Gln Ala Leu 115 120 125 MetAla Asn Gly Ile Leu Met Asn Ile Asn His His Arg Leu Arg Asp 130 135 140Arg Leu Lys Glu Leu Ala Glu Asp Gly Phe Thr Ala Asn Phe Glu Phe 145 150155 160 Val Ala Pro Thr Asn Arg Ile Val Leu Ala Tyr Gln Glu Met Lys Ile165 170 175 Ile Leu Leu Asn Val Arg Glu Asn Glu Thr Gly Glu Tyr Ile SerTyr 180 185 190 Asp Asp Ile Tyr Lys Asp Ala Thr 
Leu Arg Pro Tyr Leu ValGlu Arg 195 200 205 Tyr Glu Ile Asp Ser Pro Lys Trp Ile Glu Glu Ala LysAsn Ala Glu 210 215 220 Asn Ile Glu Gly Tyr Val Ala Val Met Lys Asp GlySer His Phe Lys 225 230 235 240 Ile Lys Ser Asp Trp Tyr Val Ser Leu HisSer Thr Lys Ser Ser Leu 245 250 255 Asp Asn Pro Glu Lys Leu Phe Lys ThrIle Ile Asp Gly Ala Ser Asp 260 265 270 Asp Leu Lys Ala Met Tyr Ala AspAsp Glu Tyr Ser Tyr Arg Lys Ile 275 280 285 Glu Ala Phe Glu Thr Thr TyrLeu Lys Tyr Leu Asp Arg Ala Leu Phe 290 295 300 Leu Val Leu Asp Cys HisAsn Lys His Cys Gly Lys Asp Arg Lys Thr 305 310 315 320 Tyr Ala Met GluAla Gln Gly Val Ala Lys Gly Ala Gly Met Asp His 325 330 335 Leu Phe GlyIle Ile Met Ser Leu Tyr Gln Gly Tyr Asp Ser Gln Glu 340 345 350 Lys ValMet Cys Glu Ile Glu Gln Asn Phe Leu Lys Asn Tyr Lys Lys 355 360 365 PheIle Pro Glu Gly Tyr 370 40 437 PRT Bacteriophage RM378 40 Met Ser MetAsn Val Lys Tyr Pro Val Glu Tyr Leu Ile Glu His Leu 1 5 10 15 Asn SerPhe Glu Ser Pro Glu Val Ala Val Glu Ser Leu Arg Lys Glu 20 25 30 Gly IleMet Cys Lys Asn Arg Gly Asp Leu Tyr Met Phe Lys Tyr His 35 40 45 Leu GlyCys Lys Phe Asp Lys Ile Tyr His Leu Ala Cys Arg Gly Ala 50 55 60 Ile LeuArg Lys Thr Asp Ser Gly Trp Lys Val Leu Ser Tyr Pro Phe 65 70 75 80 AspLys Phe Phe Asn Trp Gly Glu Glu Leu Gln Pro Glu Ile Val Asn 85 90 95 TyrTyr Gln Thr Leu Arg Tyr Ala Ser Pro Leu Asn Glu Lys Arg Lys 100 105 110Ala Gly Phe Met Phe Lys Leu Pro Met Lys Leu Val Glu Lys Leu Asp 115 120125 Gly Thr Cys Val Val Leu Tyr Tyr Asp Glu Gly Trp Lys Ile His Thr 130135 140 Leu Gly Ser Ile Asp Ala Asn Gly Ser Ile Val Lys Asn Gly Met Val145 150 155 160 Thr Thr His Met Asp Lys Thr Tyr Arg Glu Leu Phe Trp GluThr Phe 165 170 175 Glu Lys Lys Tyr Pro Pro Tyr Leu Leu Tyr His Leu AsnSer Ser Tyr 180 185 190 Cys Tyr Ile Phe Glu Met Val His Pro Asp Ala ArgVal Val Val Pro 195 200 205 Tyr Glu Glu Pro Asn Ile Ile Leu Ile Gly ValArg Ser Val Asp Pro 210 215 220 Glu Lys Gly Tyr Phe Glu Val Gly Pro SerGlu Glu Ala Val Arg Ile 225 230 235 240 Phe Asn Glu Ser Gly 
Gly Lys IleAsn Leu Lys Leu Pro Ala Val Leu 245 250 255 Ser Gln Glu Gln Asn Tyr ThrLeu Phe Arg Ala Asn Arg Leu Gln Glu 260 265 270 Leu Phe Glu Glu Val ThrPro Leu Phe Lys Ser Leu Arg Asp Gly Tyr 275 280 285 Glu Val Val Tyr GluGly Phe Val Ala Val Gln Glu Ile Ala Pro Arg 290 295 300 Val Tyr Tyr ArgThr Lys Ile Lys His Pro Val Tyr Leu Glu Leu His 305 310 315 320 Arg IleLys Thr Thr Ile Thr Pro Glu Lys Leu Ala Asp Leu Phe Leu 325 330 335 GluAsn Lys Leu Asp Asp Phe Val Leu Thr Pro Asp Glu Gln Glu Thr 340 345 350Val Met Lys Leu Lys Glu Ile Tyr Thr Asp Met Arg Asn Gln Leu Glu 355 360365 Ser Ser Phe Asp Thr Ile Tyr Lys Glu Ile Ser Glu Gln Val Ser Pro 370375 380 Glu Glu Asn Pro Gly Glu Phe Arg Lys Arg Phe Ala Leu Arg Leu Met385 390 395 400 Asp Tyr His Asp Lys Ser Trp Phe Phe Ala Arg Leu Asp GlyAsp Glu 405 410 415 Glu Lys Met Gln Lys Ser Glu Lys Lys Leu Leu Thr GluArg Ile Glu 420 425 430 Lys Gly Leu Phe Lys 435 41 300 PRT Escherichiacoli 41 Met Val Gln Ile Pro Gln Asn Pro Leu Ile Leu Val Asp Gly Ser Ser1 5 10 15 Tyr Leu Tyr Arg Ala Tyr His Ala Phe Pro Pro Leu Thr Asn SerAla 20 25 30 Gly Glu Pro Thr Gly Ala Met Tyr Gly Val Leu Asn Met Leu ArgSer 35 40 45 Leu Ile Met Gln Tyr Lys Pro Thr His Ala Ala Val Val Phe AspAla 50 55 60 Lys Gly Lys Thr Phe Arg Asp Glu Leu Phe Glu His Tyr Lys SerHis 65 70 75 80 Arg Pro Pro Met Pro Asp Asp Leu Arg Ala Gln Ile Glu ProLeu His 85 90 95 Ala Met Val Lys Ala Met Gly Leu Pro Leu Leu Ala Val SerGly Val 100 105 110 Glu Ala Asp Asp Val Ile Gly Thr Leu Ala Arg Glu AlaGlu Lys Ala 115 120 125 Gly Arg Pro Val Leu Ile Ser Thr Gly Asp Lys AspMet Ala Gln Leu 130 135 140 Val Thr Pro Asn Ile Thr Leu Ile Asn Thr MetThr Asn Thr Ile Leu 145 150 155 160 Gly Pro Glu Glu Val Val Asn Lys TyrGly Val Pro Pro Glu Leu Ile 165 170 175 Ile Asp Phe Leu Ala Leu Met GlyAsp Ser Ser Asp Asn Ile Pro Gly 180 185 190 Val Pro Gly Val Gly Glu LysThr Ala Gln Ala Leu Leu Gln Gly Leu 195 200 205 Gly Gly Leu Asp Thr LeuTyr Ala Glu Pro Glu Lys Ile Ala Gly Leu 210 215 220 Ser Phe Arg Gly 
Ala Lys Thr Met Ala Ala Lys Leu Glu Gln Asn Lys 225 230 235 240 Glu Val Ala Tyr Leu Ser Tyr Gln Leu Ala Thr Ile Lys Thr Asp Val 245 250 255 Glu Leu Glu Leu Thr Cys Glu Gln Leu Glu Val Gln Gln Pro Ala Ala 260 265 270 Glu Glu Leu Leu Gly Leu Phe Lys Lys Tyr Glu Phe Lys Arg Trp Thr 275 280 285 Ala Asp Val Glu Ala Gly Lys Trp Leu Gln Ala Lys 290 295 300 42 300 PRT Thermus aquaticus 42 Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 1 5 10 15 Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly 20 25 30 Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala 35 40 45 Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val 50 55 60 Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly 65 70 75 80 Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu 85 90 95 Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu 100 105 110 Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys 115 120 125 Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp 130 135 140 Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly 145 150 155 160 Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro 165 170 175 Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn 180 185 190 Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu 195 200 205 Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu 210 215 220 Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys 225 230 235 240 Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val 245 250 255 Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe 260 265 270 Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu 275 280 285 Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro 290 295 300 43 318 PRT Bacteriophage RM378 43 Met Lys Arg Leu Arg Asn Met Val Asn Leu Ile Asp Leu Lys Asn Gln 1 5 10 15 Tyr Tyr Ala Tyr Ser Phe Lys Phe Phe Asp Ser Tyr Gln Ile Ser Trp 20 25 30 Asp Asn Tyr Pro His Leu Lys Glu Phe Val
Ile Glu Asn Tyr Pro Gly 35 40 45 Thr Tyr Phe Ser Cys Tyr AlaPro Gly Ile Leu Tyr Lys Leu Phe Leu 50 55 60 Lys Trp Lys Arg Gly Met IleIle Asp Asp Tyr Asp Arg His Pro Leu 65 70 75 80 Arg Lys Lys Leu Leu ProGln Tyr Lys Glu His Arg Tyr Glu Tyr Ile 85 90 95 Glu Gly Lys Tyr Gly ValVal Pro Phe Pro Gly Phe Leu Lys Tyr Leu 100 105 110 Lys Phe His Phe GluAsp Leu Arg Phe Lys Met Arg Asp Leu Gly Ile 115 120 125 Thr Asp Phe LysTyr Ala Leu Ala Ile Ser Leu Phe Tyr Asn Arg Val 130 135 140 Met Leu ArgAsp Phe Leu Lys Asn Phe Thr Cys Tyr Tyr Ile Ala Glu 145 150 155 160 TyrGlu Ala Asp Asp Val Ile Ala His Leu Ala Arg Glu Ile Ala Arg 165 170 175Ser Asn Ile Asp Val Asn Ile Val Ser Thr Asp Lys Asp Tyr Tyr Gln 180 185190 Leu Trp Asp Glu Glu Asp Ile Arg Glu Arg Val Tyr Ile Asn Ser Leu 195200 205 Ser Cys Ser Asp Val Lys Thr Pro Arg Tyr Gly Phe Leu Thr Ile Lys210 215 220 Ala Leu Leu Gly Asp Lys Ser Asp Asn Ile Pro Lys Ser Leu GluLys 225 230 235 240 Gly Lys Gly Glu Lys Tyr Leu Glu Lys Lys Gly Phe AlaGlu Glu Asp 245 250 255 Tyr Asp Lys Glu Leu Phe Glu Asn Asn Leu Lys ValIle Arg Phe Gly 260 265 270 Asp Glu Tyr Leu Gly Glu Arg Asp Lys Ser PheIle Glu Asn Phe Ser 275 280 285 Thr Gly Asp Thr Leu Trp Asn Phe Tyr GluPhe Phe Tyr Tyr Asp Pro 290 295 300 Leu His Glu Leu Phe Leu Arg Asn IleArg Lys Arg Arg Leu 305 310 315 44 305 PRT Bacteriophage T4 44 Met AspLeu Glu Met Met Leu Asp Glu Asp Tyr Lys Glu Gly Ile Cys 1 5 10 15 LeuIle Asp Phe Ser Gln Ile Ala Leu Ser Thr Ala Leu Val Asn Phe 20 25 30 ProAsp Lys Glu Lys Ile Asn Leu Ser Met Val Arg His Leu Ile Leu 35 40 45 AsnSer Ile Lys Phe Asn Val Lys Lys Ala Lys Thr Leu Gly Tyr Thr 50 55 60 LysIle Val Leu Cys Ile Asp Asn Ala Lys Ser Gly Tyr Trp Arg Arg 65 70 75 80Asp Phe Ala Tyr Tyr Tyr Lys Lys Asn Arg Gly Lys Ala Arg Glu Glu 85 90 95Ser Thr Trp Asp Trp Glu Gly Tyr Phe Glu Ser Ser His Lys Val Ile 100 105110 Asp Glu Leu Lys Ala Tyr Met Pro Tyr Ile Val Met Asp Ile Asp Lys 115120 125 Tyr Glu Ala Asp Asp His Ile Ala Val Leu Val Lys Lys Phe Ser Leu130 135 140 Glu Gly 
His Lys Ile Leu Ile Ile Ser Ser Asp Gly Asp Phe ThrGln 145 150 155 160 Leu His Lys Tyr Pro Asn Val Lys Gln Trp Ser Pro MetHis Lys Lys 165 170 175 Trp Val Lys Ile Lys Ser Gly Ser Ala Glu Ile AspCys Met Thr Lys 180 185 190 Ile Leu Lys Gly Asp Lys Lys Asp Asn Val AlaSer Val Lys Val Arg 195 200 205 Ser Asp Phe Trp Phe Thr Arg Val Glu GlyGlu Arg Thr Pro Ser Met 210 215 220 Lys Thr Ser Ile Val Glu Ala Ile AlaAsn Asp Arg Glu Gln Ala Lys 225 230 235 240 Val Leu Leu Thr Glu Ser GluTyr Asn Arg Tyr Lys Glu Asn Leu Val 245 250 255 Leu Ile Asp Phe Asp TyrIle Pro Asp Asn Ile Ala Ser Asn Ile Val 260 265 270 Asn Tyr Tyr Asn SerTyr Lys Leu Pro Pro Arg Gly Lys Ile Tyr Ser 275 280 285 Tyr Phe Val LysAla Gly Leu Ser Lys Leu Thr Asn Ser Ile Asn Glu 290 295 300 Phe 305 45300 PRT Bacteriophage T7 45 Met Ala Leu Leu Asp Leu Lys Gln Phe Tyr GluLeu Arg Glu Gly Cys 1 5 10 15 Asp Asp Lys Gly Ile Leu Val Met Asp GlyAsp Trp Leu Val Phe Gln 20 25 30 Ala Met Ser Ala Ala Glu Phe Asp Ala SerTrp Glu Glu Glu Ile Trp 35 40 45 His Arg Cys Cys Asp His Ala Lys Ala ArgGln Ile Leu Glu Asp Ser 50 55 60 Ile Lys Ser Tyr Glu Thr Arg Lys Lys AlaTrp Ala Gly Ala Pro Ile 65 70 75 80 Val Leu Ala Phe Thr Asp Ser Val AsnTrp Arg Lys Glu Leu Val Asp 85 90 95 Pro Asn Tyr Lys Ala Asn Arg Lys AlaVal Lys Lys Pro Val Gly Tyr 100 105 110 Phe Glu Phe Leu Asp Ala Leu PheGlu Arg Glu Glu Phe Tyr Cys Ile 115 120 125 Arg Glu Pro Met Leu Glu GlyAsp Asp Val Met Gly Val Ile Ala Ser 130 135 140 Asn Pro Ser Ala Phe GlyAla Arg Lys Ala Val Ile Ile Ser Cys Asp 145 150 155 160 Lys Asp Phe LysThr Ile Pro Asn Cys Asp Phe Leu Trp Cys Thr Thr 165 170 175 Gly Asn IleLeu Thr Gln Thr Glu Glu Ser Ala Asp Trp Trp His Leu 180 185 190 Phe GlnThr Ile Lys Gly Asp Ile Thr Asp Gly Tyr Ser Gly Ile Ala 195 200 205 GlyTrp Gly Asp Thr Ala Glu Asp Phe Leu Asn Asn Pro Phe Ile Thr 210 215 220Glu Pro Lys Thr Ser Val Leu Lys Ser Gly Lys Asn Lys Gly Gln Glu 225 230235 240 Val Thr Lys Trp Val Lys Arg Asp Pro Glu Pro His Glu Thr Leu Trp245 250 255 Asp Cys Ile Lys 
Ser Ile Gly Ala Lys Ala Gly Met Thr Glu Glu Asp 260 265 270 Ile Ile Lys Gln Gly Gln Met Ala Arg Ile Leu Arg Phe Asn Glu Tyr 275 280 285 Asn Phe Ile Asp Lys Glu Ile Tyr Leu Trp Arg Pro 290 295 300 46 287 PRT Escherichia coli 46 Val Leu Asp Ala Thr Val Ala Arg Ile Glu Gln Leu Phe Gln Gln Pro 1 5 10 15 His Asp Gly Val Thr Gly Val Asn Thr Gly Tyr Asp Asp Leu Asn Lys 20 25 30 Lys Thr Ala Gly Leu Gln Pro Ser Asp Leu Ile Ile Val Ala Ala Arg 35 40 45 Pro Ser Met Gly Lys Thr Thr Phe Ala Met Asn Leu Val Glu Asn Ala 50 55 60 Ala Met Leu Gln Asp Lys Pro Val Leu Ile Phe Ser Leu Glu Met Pro 65 70 75 80 Ser Glu Gln Ile Met Met Arg Ser Leu Ala Ser Leu Ser Arg Val Asp 85 90 95 Gln Thr Lys Ile Arg Thr Gly Gln Leu Asp Asp Glu Asp Trp Ala Arg 100 105 110 Ile Ser Gly Thr Met Gly Ile Leu Leu Glu Lys Arg Asn Ile Tyr Ile 115 120 125 Asp Asp Ser Ser Gly Leu Thr Pro Thr Glu Val Arg Ser Arg Ala Arg 130 135 140 Arg Ile Ala Arg Glu His Gly Gly Ile Gly Leu Ile Met Ile Asp Tyr 145 150 155 160 Leu Gln Leu Met Arg Val Pro Ala Leu Ser Asp Asn Arg Thr Leu Glu 165 170 175 Ile Ala Glu Ile Ser Arg Ser Leu Lys Ala Leu Ala Lys Glu Leu Asn 180 185 190 Val Pro Val Val Ala Leu Ser Gln Leu Asn Arg Ser Leu Glu Gln Arg 195 200 205 Ala Asp Lys Arg Pro Val Asn Ser Asp Leu Arg Glu Ser Gly Ser Ile 210 215 220 Glu Gln Asp Ala Asp Leu Ile Met Phe Ile Tyr Arg Asp Glu Val Tyr 225 230 235 240 His Glu Asn Ser Asp Leu Lys Gly Ile Ala Glu Ile Ile Ile Gly Lys 245 250 255 Gln Arg Asn Gly Pro Ile Gly Thr Val Arg Leu Thr Phe Asn Gly Gln 260 265 270 Trp Ser Arg Phe Asp Asn Tyr Ala Gly Pro Gln Tyr Asp Asp Glu 275 280 285 47 291 PRT Haemophilus influenzae 47 Val Leu Glu Ser Thr Ile Glu Lys Ile Asp Ile Leu Ser Lys Leu Glu 1 5 10 15 Asn His Ser Gly Val Thr Gly Val Thr Thr Gly Phe Thr Asp Leu Asp 20 25 30 Lys Lys Thr Ala Gly Leu Gln Pro Ser Asp Leu Ile Ile Val Ala Ala 35 40 45 Arg Pro Ser Met Gly Lys Thr Thr Phe Ala Met Asn Leu Cys Glu Asn 50 55 60 Ala Ala Met Ala Ser Glu Lys Pro Val Leu Val Phe Ser Leu Glu Met 65 70 75 80 Pro Ala Glu Gln Ile Met Met Arg Met
Ile Ala Ser Leu Ala Arg Val 85 90 95 Asp Gln Thr Lys Ile Arg Thr Gly Gln Asn Leu Asp Glu Ile Glu Trp 100 105 110 Asn Lys Ile Ala Ser Val Val Gly Met Phe Lys Gln Lys Asn Asn Leu 115 120 125 Phe Ile Asp Asp Ser Ser Gly Leu Thr Pro Thr Asp Val Arg Ser Arg 130 135 140 Ala Arg Arg Val Tyr Arg Glu Asn Gly Gly Leu Ser Met Ile Met Val 145 150 155 160 Asp Tyr Leu Gln Leu Met Arg Ala Pro Ala Phe Ser Asp Asn Arg Thr 165 170 175 Leu Glu Ile Ala Glu Ile Ser Arg Ser Leu Lys Ala Leu Ala Lys Glu 180 185 190 Leu Gln Val Pro Val Val Ala Leu Ser Gln Leu Asn Arg Thr Leu Glu 195 200 205 Gln Arg Gly Asp Lys Arg Pro Val Asn Ser Asp Leu Arg Glu Ser Gly 210 215 220 Ser Ile Glu Gln Asp Ala Asp Leu Ile Met Phe Ile Tyr Arg Asp Glu 225 230 235 240 Val Tyr Asn Asp Asn Ser Glu Asp Lys Gly Val Ala Glu Ile Ile Ile 245 250 255 Gly Lys Gln Arg Asn Gly Pro Ile Gly Arg Val Arg Leu Lys Phe Asn 260 265 270 Gly Gln Phe Ser Arg Phe Asp Asn Leu Ala Glu Gln Arg Glu Tyr Arg 275 280 285 Asp Asp Tyr 290 48 287 PRT Chlamydia trachomatis 48 Ala Leu Gln Glu Arg Gln Glu Ala Phe Gln Ala Ser Ala His Asp Ser 1 5 10 15 Ser Ser Pro Met Leu Ser Gly Phe Pro Thr His Phe Leu Asp Leu Asp 20 25 30 Lys Met Ile Ser Gly Phe Ser Pro Ser Asn Leu Ile Ile Leu Ala Ala 35 40 45 Arg Pro Ala Met Gly Lys Thr Ala Leu Ala Leu Asn Ile Val Glu Asn 50 55 60 Phe Cys Phe Asp Ser Arg Leu Pro Val Gly Ile Phe Ser Leu Glu Met 65 70 75 80 Thr Val Asp Gln Leu Ile His Arg Ile Ile Cys Ser Arg Ser Glu Val 85 90 95 Glu Ala Lys Lys Ile Ser Val Gly Asp Ile Ser Gly Arg Asp Phe Gln 100 105 110 Arg Val Val Ser Val Val Arg Glu Met Glu Glu His Thr Leu Leu Ile 115 120 125 Asp Asp Tyr Pro Gly Leu Lys Ile Thr Asp Leu Arg Ala Arg Ala Arg 130 135 140 Arg Met Lys Glu Ser Tyr Asp Ile Gln Phe Leu Val Ile Asp Tyr Leu 145 150 155 160 Gln Leu Ile Ser Ser Ser Gly Asn Leu Arg Asn Ser Asp Ser Arg Asn 165 170 175 Gln Glu Ile Ser Glu Ile Ser Arg Met Leu Lys Asn Leu Ala Arg Glu 180 185 190 Leu Asn Ile Pro Ile Leu Cys Leu Ser Gln Leu Ser Arg Lys Val Glu 195 200 205 Asp Arg Ala Asn His Arg Pro Leu Met
Ser Asp Leu Arg Glu Ser Gly 210 215 220 Ser Ile Glu Gln Asp Ala Asp Gln Ile Met Phe Leu Leu Arg Arg Glu 225 230 235 240 Tyr Tyr Asp Pro Asn Asp Lys Pro Gly Thr Ala Glu Leu Ile Val Ala 245 250 255 Lys Asn Arg His Gly Ser Ile Gly Ser Val Gln Leu Val Phe Glu Lys 260 265 270 Asp Phe Ala Arg Phe Arg Asn Tyr Ala Gly Cys Glu Phe Pro Gly 275 280 285 49 290 PRT Bacillus stearothermophilus 49 Ile Leu Val Gln Thr Tyr Asp Asn Ile Glu Met Leu His Asn Arg Asp 1 5 10 15 Gly Glu Ile Thr Gly Ile Pro Thr Gly Phe Thr Glu Leu Asp Arg Met 20 25 30 Thr Ser Gly Phe Gln Arg Ser Asp Leu Ile Ile Val Ala Ala Arg Pro 35 40 45 Ser Val Gly Lys Thr Ala Phe Ala Leu Asn Ile Ala Gln Asn Val Ala 50 55 60 Thr Lys Thr Asn Glu Asn Val Ala Ile Phe Ser Leu Glu Met Ser Ala 65 70 75 80 Gln Gln Leu Val Met Arg Met Leu Cys Ala Glu Gly Asn Ile Asn Ala 85 90 95 Gln Asn Leu Arg Thr Gly Lys Leu Thr Pro Glu Asp Trp Gly Lys Leu 100 105 110 Thr Met Ala Met Gly Ser Leu Ser Asn Ala Gly Ile Tyr Ile Asp Asp 115 120 125 Thr Pro Ser Ile Arg Val Ser Asp Ile Arg Ala Lys Cys Arg Arg Leu 130 135 140 Lys Gln Glu Ser Gly Leu Gly Met Ile Val Ile Asp Tyr Leu Gln Leu 145 150 155 160 Ile Gln Gly Ser Gly Arg Ser Lys Glu Asn Arg Gln Gln Glu Val Ser 165 170 175 Glu Ile Ser Arg Ser Leu Lys Ala Leu Ala Arg Glu Leu Glu Val Pro 180 185 190 Val Ile Ala Leu Ser Gln Leu Ser Arg Ser Val Glu Gln Arg Gln Asp 195 200 205 Lys Arg Pro Met Met Ser Asp Ile Arg Glu Ser Gly Ser Ile Glu Gln 210 215 220 Asp Ala Asp Ile Val Ala Phe Leu Tyr Arg Asp Asp Tyr Tyr Asn Lys 225 230 235 240 Asp Ser Glu Asn Lys Asn Ile Ile Glu Ile Ile Ile Ala Lys Gln Arg 245 250 255 Asn Gly Pro Val Gly Thr Val Gln Leu Ala Phe Ile Lys Glu Tyr Asn 260 265 270 Lys Phe Val Asn Leu Glu Arg Arg Phe Asp Glu Ala Gln Ile Pro Pro 275 280 285 Gly Ala 290 50 332 PRT Helicobacter pylori 50 Val Leu Glu Ser Ala Met Asp Leu Ile Thr Glu Asn Gln Arg Lys Gly 1 5 10 15 Ser Leu Glu Val Thr Gly Ile Pro Thr Gly Phe Val Gln Leu Asp Asn 20 25 30 Tyr Thr Ser Gly Phe Asn Lys Gly Ser Leu Val Ile Ile Gly Ala Arg 35 40 45 Pro Ser Met Gly
Lys Thr Ser Leu Met Met Asn Met Val Leu Ser Ala 50 55 60 Leu Asn Asp Asp Arg Gly Val Ala Val Phe Ser Leu Glu Met Ser Ala 65 70 75 80 Glu Gln Leu Ala Leu Arg Ala Leu Ser Asp Leu Thr Ser Ile Asn Met 85 90 95 His Asp Leu Glu Ser Gly Arg Leu Asp Asp Asp Gln Trp Glu Asn Leu 100 105 110 Ala Lys Cys Phe Asp His Leu Ser Gln Lys Lys Leu Phe Phe Tyr Asp 115 120 125 Lys Ser Tyr Val Arg Ile Glu Gln Ile Arg Leu Gln Leu Arg Lys Leu 130 135 140 Lys Ser Gln His Lys Glu Leu Gly Ile Ala Phe Ile Asp Tyr Leu Gln 145 150 155 160 Leu Met Ser Gly Ser Lys Ala Thr Lys Glu Arg His Glu Gln Ile Ala 165 170 175 Glu Ile Ser Arg Glu Leu Lys Thr Leu Ala Arg Glu Leu Glu Ile Pro 180 185 190 Ile Ile Ala Leu Val Gln Leu Asn Arg Ser Leu Glu Asn Arg Asp Asp 195 200 205 Lys Arg Pro Ile Leu Ser Asp Ile Lys Asp Ser Gly Gly Ile Glu Gln 210 215 220 Asp Ala Asp Ile Val Leu Phe Leu Tyr Arg Gly Tyr Ile Tyr Gln Met 225 230 235 240 Arg Ala Glu Asp Asn Lys Ile Asp Lys Leu Lys Lys Glu Gly Lys Ile 245 250 255 Glu Glu Ala Gln Glu Leu Tyr Leu Lys Val Asn Glu Glu Arg Arg Ile 260 265 270 His Lys Gln Asn Gly Ser Ile Glu Glu Ala Glu Ile Ile Val Ala Lys 275 280 285 Asn Arg Asn Gly Ala Thr Gly Thr Val Tyr Thr Arg Phe Asn Ala Pro 290 295 300 Phe Thr Arg Tyr Glu Asp Met Pro Ile Asp Ser His Leu Glu Glu Gly 305 310 315 320 Gln Glu Thr Lys Val Asp Tyr Asp Ile Val Thr Thr 325 330 51 295 PRT Mycoplasma genitalium 51 Glu Ile Ala Asn Gln Glu Glu Ala Leu Ile Lys Lys Val His Arg Gly 1 5 10 15 Glu Leu Ile Ile Ser Gly Leu Ser Ser Gly Phe Leu Lys Leu Asp Gln 20 25 30 Leu Thr Ser Gly Trp Lys Pro Gly Glu Leu Ile Val Ile Ala Ala Arg 35 40 45 Pro Gly Arg Gly Lys Thr Ala Leu Leu Ile Asn Phe Met Ala Ser Ala 50 55 60 Ala Lys Gln Ile Asp Pro Lys Thr Asp Val Val Leu Phe Phe Ser Leu 65 70 75 80 Glu Met Arg Asn Arg Glu Ile Tyr Gln Arg His Leu Met His Glu Ser 85 90 95 Gln Thr Ser Tyr Thr Leu Thr Asn Arg Gln Arg Ile Asn Asn Val Phe 100 105 110 Glu Glu Leu Met Glu Ala Ser Ser Arg Ile Lys Asn Leu Pro Ile Lys 115 120 125 Leu Phe Asp Tyr Ser Ser Leu Thr Leu Gln Glu Ile Arg Asn
Gln Ile 130135 140 Thr Glu Val Ser Lys Thr Ser Asn Val Arg Leu Val Ile Ile Asp Tyr145 150 155 160 Leu Gln Leu Val Asn Ala Leu Lys Asn Asn Tyr Gly Leu ThrArg Gln 165 170 175 Gln Glu Val Thr Met Ile Ser Gln Ser Leu Lys Ala PheAla Lys Glu 180 185 190 Phe Asn Thr Pro Ile Ile Ala Ala Ala Gln Leu SerArg Arg Ile Glu 195 200 205 Glu Arg Lys Asp Ser Arg Pro Ile Leu Ser AspLeu Arg Glu Ser Gly 210 215 220 Ser Ile Glu Gln Asp Ala Asp Met Val LeuPhe Ile His Arg Thr Asn 225 230 235 240 Asp Asp Lys Lys Glu Gln Glu GluGlu Asn Thr Asn Leu Phe Glu Val 245 250 255 Glu Leu Ile Leu Glu Lys AsnArg Asn Gly Pro Asn Gly Lys Val Lys 260 265 270 Leu Asn Phe Arg Ser AspThr Ser Ser Phe Ile Ser Gln Tyr Ser Pro 275 280 285 Ser Phe Asp Asp GlnTyr Ser 290 295 52 283 PRT Borrelia burgdorferi 52 Ile Ala Glu Arg ValHis Asn Glu Ile Tyr Glu Arg Ser Met Lys Lys 1 5 10 15 Lys Glu Ala AsnPhe Gly Ile Pro Ser Gly Phe Arg Lys Val Asp Ser 20 25 30 Leu Ile Gly GlyPhe Arg Asn Ser Asp Phe Ile Ile Val Gly Ala Arg 35 40 45 Pro Ser Ile GlyLys Thr Ala Phe Ala Leu Asn Ile Ala Ser Tyr Ile 50 55 60 Ala Leu Arg LysGlu Glu Lys Lys Lys Val Gly Phe Phe Ser Leu Glu 65 70 75 80 Met Thr AlaAsp Ala Leu Ile Lys Arg Ile Ile Ser Ser Gln Ser Cys 85 90 95 Ile Asp SerPhe Lys Val Gln Asn Ser Ile Leu Ser Gly Gln Glu Ile 100 105 110 Lys SerLeu Asn Asp Ile Ile Asn Glu Ile Ser Asp Ser Glu Leu Tyr 115 120 125 IleGlu Asp Thr Pro Asn Ile Ser Leu Leu Thr Leu Ala Thr Gln Ala 130 135 140Arg Lys Leu Lys Arg Phe Tyr Gly Ile Asp Ile Ile Phe Val Asp Tyr 145 150155 160 Ile Ser Leu Ile Ser Phe Glu Thr Lys Asn Leu Pro Arg His Glu Gln165 170 175 Val Ala Ser Ile Ser Lys Ser Leu Lys Glu Leu Ala Arg Glu LeuGlu 180 185 190 Ile Pro Ile Val Ala Leu Ser Gln Leu Thr Arg Asp Thr GluGly Arg 195 200 205 Glu Pro Asn Leu Ala Ser Leu Arg Glu Ser Gly Ala LeuGlu Gln Asp 210 215 220 Ala Asp Ile Val Ile Leu Leu His Arg Asp Lys AspPhe Lys Phe Glu 225 230 235 240 Ser Ser Ala Glu Ile Glu Pro Ile Glu ThrLys Val Ile Val Ala Lys 245 250 255 His Arg Asn Gly Pro Thr Gly Arg 
AlaAsp Ile Leu Phe Leu Pro His 260 265 270 Ile Thr Lys Phe Val Asn Lys AspHis Gln Tyr 275 280 53 327 PRT Bacteriophage T4 53 Tyr Val Gly His AspTrp Met Asp Asp Tyr Glu Ala Arg Trp Leu Ser 1 5 10 15 Tyr Met Asn LysAla Arg Lys Val Pro Phe Lys Leu Arg Ile Leu Asn 20 25 30 Lys Ile Thr LysGly Gly Ala Glu Thr Gly Thr Leu Asn Val Leu Met 35 40 45 Ala Gly Val AsnVal Gly Lys Ser Leu Gly Leu Cys Ser Leu Ala Ala 50 55 60 Asp Tyr Leu GlnLeu Gly His Asn Val Leu Tyr Ile Ser Met Glu Met 65 70 75 80 Ala Glu GluVal Cys Ala Lys Arg Ile Asp Ala Asn Met Leu Asp Val 85 90 95 Ser Leu AspAsp Ile Asp Asp Gly His Ile Ser Tyr Ala Glu Tyr Lys 100 105 110 Gly LysMet Glu Lys Trp Arg Glu Lys Ser Thr Leu Gly Arg Leu Ile 115 120 125 ValLys Gln Tyr Pro Thr Gly Gly Ala Asp Ala Asn Thr Phe Arg Ser 130 135 140Leu Leu Asn Glu Leu Lys Leu Lys Lys Asn Phe Val Pro Thr Ile Ile 145 150155 160 Ile Val Asp Tyr Leu Gly Ile Cys Lys Ser Cys Arg Ile Arg Val Tyr165 170 175 Ser Glu Asn Ser Tyr Thr Thr Val Lys Ala Ile Ala Glu Glu LeuArg 180 185 190 Ala Leu Ala Val Glu Thr Glu Thr Val Leu Trp Thr Ala AlaGln Val 195 200 205 Gly Lys Gln Ala Trp Asp Ser Ser Asp Val Asn Met SerAsp Ile Ala 210 215 220 Glu Ser Ala Gly Leu Pro Ala Thr Ala Asp Phe MetLeu Ala Val Ile 225 230 235 240 Glu Thr Glu Glu Leu Ala Ala Ala Glu GlnGln Leu Ile Lys Gln Ile 245 250 255 Lys Ser Arg Tyr Gly Asp Lys Asn LysTrp Asn Lys Phe Leu Met Gly 260 265 270 Val Gln Lys Gly Asn Gln Lys TrpVal Glu Ile Glu Gln Asp Ser Thr 275 280 285 Pro Thr Glu Val Asn Glu ValAla Gly Ser Gln Gln Ile Gln Ala Glu 290 295 300 Gln Asn Arg Tyr Gln ArgAsn Glu Ser Thr Arg Ala Gln Leu Asp Ala 305 310 315 320 Leu Ala Asn GluLeu Lys Phe 325 54 302 PRT Bacteriophage T7 54 Val Val Ser Ala Leu SerLeu Arg Glu Arg Ile Arg Glu His Leu Ser 1 5 10 15 Ser Glu Glu Ser ValGly Leu Leu Phe Ser Gly Cys Thr Gly Ile Asn 20 25 30 Asp Lys Thr Leu GlyAla Arg Gly Gly Glu Val Ile Met Val Thr Ser 35 40 45 Gly Ser Gly Met GlyLys Ser Thr Phe Val Arg Gln Gln Ala Leu Gln 50 55 60 Trp Gly Thr Ala MetGly Lys 
Lys Val Gly Leu Ala Met Leu Glu Glu 65 70 75 80 Ser Val Glu GluThr Ala Glu Asp Leu Ile Gly Leu His Asn Arg Val 85 90 95 Arg Leu Arg GlnSer Asp Ser Leu Lys Arg Glu Ile Ile Glu Asn Gly 100 105 110 Lys Phe AspGln Trp Phe Asp Glu Leu Phe Gly Asn Asp Thr Phe His 115 120 125 Leu TyrAsp Ser Phe Ala Glu Ala Glu Thr Asp Arg Leu Leu Ala Lys 130 135 140 LeuAla Tyr Met Arg Ser Gly Leu Gly Cys Asp Val Ile Ile Leu Asp 145 150 155160 His Ile Ser Ile Val Val Ser Ala Ser Gly Glu Ser Asp Glu Arg Lys 165170 175 Met Ile Asp Asn Leu Met Thr Lys Leu Lys Gly Phe Ala Lys Ser Thr180 185 190 Gly Val Val Leu Val Val Ile Cys His Leu Lys Asn Pro Asp LysGly 195 200 205 Lys Ala His Glu Glu Gly Arg Pro Val Ser Ile Thr Asp LeuArg Gly 210 215 220 Ser Gly Ala Leu Arg Gln Leu Ser Asp Thr Ile Ile AlaLeu Glu Arg 225 230 235 240 Asn Gln Gln Gly Asp Met Pro Asn Leu Val LeuVal Arg Ile Leu Lys 245 250 255 Cys Arg Phe Thr Gly Asp Thr Gly Ile AlaGly Tyr Met Glu Tyr Asn 260 265 270 Lys Glu Thr Gly Trp Leu Glu Pro SerSer Tyr Ser Gly Glu Glu Glu 275 280 285 Ser His Ser Glu Ser Thr Asp TrpSer Asn Asp Thr Asp Phe 290 295 300 55 270 PRT Bacteriophage RM378 55Val Ser Leu Val Glu Glu Phe Asp Leu Ala Thr Ser Glu Phe Asn Glu 1 5 1015 Leu Phe Val Lys Glu Glu Arg Ile Pro Thr Pro Trp Glu Ser Val Asn 20 2530 Lys Asn Met Ala Gly Gly Leu Gly Arg Gly Glu Leu Gly Ile Val Met 35 4045 Leu Pro Ser Gly Trp Gly Lys Ser Trp Phe Leu Val Ser Leu Gly Leu 50 5560 His Ala Phe Arg Thr Gly Lys Arg Val Ile Tyr Phe Thr Leu Glu Leu 65 7075 80 Asp Gln Lys Tyr Val Met Lys Arg Phe Leu Lys Met Phe Ala Pro Tyr 8590 95 Cys Lys Gly Arg Ala Ser Ser Tyr Arg Asp Val Tyr Gln Ile Met Lys100 105 110 Glu Leu Met Phe Ser Gln Asp Asn Leu Leu Lys Ile Val Phe CysAsn 115 120 125 Ala Met Glu Asp Ile Glu His Tyr Ile Ala Leu Tyr Asn ProAsp Val 130 135 140 Val Leu Ile Asp Tyr Ala Asp Leu Ile Tyr Asp Val GluThr Asp Lys 145 150 155 160 Glu Lys Asn Tyr Leu Leu Leu Gln Lys Ile TyrArg Lys Leu Arg Leu 165 170 175 Ile Ala Lys Val Tyr Asn Thr Ala Val TrpSer Ala Ser Gln 
Leu Asn 180 185 190 Arg Gly Ser Leu Ser Lys Gln Ala AspVal Asp Phe Ile Glu Lys Tyr 195 200 205 Ile Ala Asp Ser Phe Ala Lys ValVal Glu Ile Asp Phe Gly Met Ala 210 215 220 Phe Ile Pro Asp Ser Glu AsnSer Thr Pro Asp Ile His Val Gly Phe 225 230 235 240 Gly Lys Ile Phe LysAsn Arg Met Gly Ala Val Arg Lys Leu Glu Tyr 245 250 255 Thr Ile Asn PheGlu Asn Tyr Thr Val Asp Val Ala Val Lys 260 265 270 56 1197 DNABacteriophage RM378 CDS (112)...(1158) 56 attttctgtt ttttcacaggcaagtattcg acatgctcga aacccgcgaa gcttattatc 60 agttgcttca atcgttaaacgatttcctcg aagaagacct gaaggagaat t atg aag 117 Met Lys 1 atc acg cta agcgca agc gta tac ccc cga tcg atg aaa att tac gga 165 Ile Thr Leu Ser AlaSer Val Tyr Pro Arg Ser Met Lys Ile Tyr Gly 5 10 15 gtg gag cta atc gagggg aaa aaa cac tta ttt caa tca ccc gta ccc 213 Val Glu Leu Ile Glu GlyLys Lys His Leu Phe Gln Ser Pro Val Pro 20 25 30 cca cat ttg aag cgc atcgct cag cag aat cga ggg aag att gag gct 261 Pro His Leu Lys Arg Ile AlaGln Gln Asn Arg Gly Lys Ile Glu Ala 35 40 45 50 gag gct ata tcc tat tacatc aga gaa caa aaa agc cac atc acg ccg 309 Glu Ala Ile Ser Tyr Tyr IleArg Glu Gln Lys Ser His Ile Thr Pro 55 60 65 gaa gct ttg tct cag tgc gtcttt atc gat att gag acg att tcc ccg 357 Glu Ala Leu Ser Gln Cys Val PheIle Asp Ile Glu Thr Ile Ser Pro 70 75 80 aaa aaa agc ttt ccc gac ccg tggaga gac cca gtt tat tcc att tcc 405 Lys Lys Ser Phe Pro Asp Pro Trp ArgAsp Pro Val Tyr Ser Ile Ser 85 90 95 atc aaa ccg tat gga aaa ccg gtg gtggta gtg ctt ctc ctt atc acc 453 Ile Lys Pro Tyr Gly Lys Pro Val Val ValVal Leu Leu Leu Ile Thr 100 105 110 aac ccg gag gct cat atc gat aac tttaac aaa ttt acc acc agc gta 501 Asn Pro Glu Ala His Ile Asp Asn Phe AsnLys Phe Thr Thr Ser Val 115 120 125 130 ggg gat aac aca ttt gaa att cattac aga aca ttc ctt tcg gaa aaa 549 Gly Asp Asn Thr Phe Glu Ile His TyrArg Thr Phe Leu Ser Glu Lys 135 140 145 aga ttg ctc gag tat ttc tgg aatgtg ctg aaa cca aaa ttt act ttc 597 Arg Leu Leu Glu Tyr Phe Trp Asn ValLeu Lys Pro Lys Phe Thr Phe 150 155 160 
atg ctc gca tgg aac ggt tat cagttc gat tat ccc tac ctg ctc att 645 Met Leu Ala Trp Asn Gly Tyr Gln PheAsp Tyr Pro Tyr Leu Leu Ile 165 170 175 cgt agt cat atc cat gag gtg aatgtc att agt gat aag ttg ctt ccg 693 Arg Ser His Ile His Glu Val Asn ValIle Ser Asp Lys Leu Leu Pro 180 185 190 gac tgg aag ctg gtg cgg aaa atttcc gat cga aac cta cca ttc tat 741 Asp Trp Lys Leu Val Arg Lys Ile SerAsp Arg Asn Leu Pro Phe Tyr 195 200 205 210 ttc aat ccc cgt acc cct gtagaa ttt gtg ttt ttt gat tac atg cgg 789 Phe Asn Pro Arg Thr Pro Val GluPhe Val Phe Phe Asp Tyr Met Arg 215 220 225 ctt tat cgc tcc ttt gtg gcatac aaa gag ttg gag tcc tac cgg ctc 837 Leu Tyr Arg Ser Phe Val Ala TyrLys Glu Leu Glu Ser Tyr Arg Leu 230 235 240 gac tat att gcg cga gag gaaata gga gaa ggt aag gtg gat ttc gac 885 Asp Tyr Ile Ala Arg Glu Glu IleGly Glu Gly Lys Val Asp Phe Asp 245 250 255 gta aga ttc tat cat gag attcct gtc tac ccg gat aaa aag ttg gtg 933 Val Arg Phe Tyr His Glu Ile ProVal Tyr Pro Asp Lys Lys Leu Val 260 265 270 gaa tac aac gcc gta gac gccatt ttg atg gaa gaa atc gaa aat aaa 981 Glu Tyr Asn Ala Val Asp Ala IleLeu Met Glu Glu Ile Glu Asn Lys 275 280 285 290 aac cat att ctc ccg acgctg ttt gaa att gca aga ctt tca aat ctg 1029 Asn His Ile Leu Pro Thr LeuPhe Glu Ile Ala Arg Leu Ser Asn Leu 295 300 305 act ccc gca ctg gca ttgaac gct tcc aat att ctt atc gga aat gtt 1077 Thr Pro Ala Leu Ala Leu AsnAla Ser Asn Ile Leu Ile Gly Asn Val 310 315 320 aca gga aaa ctt ggt gtcaaa ttc gtt gat tac atc aag aaa atc gac 1125 Thr Gly Lys Leu Gly Val LysPhe Val Asp Tyr Ile Lys Lys Ile Asp 325 330 335 acc att aat aca atg ttcaaa aaa ata cct gag taaactatga atatgcagac 1178 Thr Ile Asn Thr Met PheLys Lys Ile Pro Glu 340 345 cattgacgaa acgctttat 1197 57 349 PRTBacteriophage RM378 57 Met Lys Ile Thr Leu Ser Ala Ser Val Tyr Pro ArgSer Met Lys Ile 1 5 10 15 Tyr Gly Val Glu Leu Ile Glu Gly Lys Lys HisLeu Phe Gln Ser Pro 20 25 30 Val Pro Pro His Leu Lys Arg Ile Ala Gln GlnAsn Arg Gly Lys Ile 35 40 45 Glu Ala Glu Ala Ile Ser Tyr 
Tyr Ile Arg GluGln Lys Ser His Ile 50 55 60 Thr Pro Glu Ala Leu Ser Gln Cys Val Phe IleAsp Ile Glu Thr Ile 65 70 75 80 Ser Pro Lys Lys Ser Phe Pro Asp Pro TrpArg Asp Pro Val Tyr Ser 85 90 95 Ile Ser Ile Lys Pro Tyr Gly Lys Pro ValVal Val Val Leu Leu Leu 100 105 110 Ile Thr Asn Pro Glu Ala His Ile AspAsn Phe Asn Lys Phe Thr Thr 115 120 125 Ser Val Gly Asp Asn Thr Phe GluIle His Tyr Arg Thr Phe Leu Ser 130 135 140 Glu Lys Arg Leu Leu Glu TyrPhe Trp Asn Val Leu Lys Pro Lys Phe 145 150 155 160 Thr Phe Met Leu AlaTrp Asn Gly Tyr Gln Phe Asp Tyr Pro Tyr Leu 165 170 175 Leu Ile Arg SerHis Ile His Glu Val Asn Val Ile Ser Asp Lys Leu 180 185 190 Leu Pro AspTrp Lys Leu Val Arg Lys Ile Ser Asp Arg Asn Leu Pro 195 200 205 Phe TyrPhe Asn Pro Arg Thr Pro Val Glu Phe Val Phe Phe Asp Tyr 210 215 220 MetArg Leu Tyr Arg Ser Phe Val Ala Tyr Lys Glu Leu Glu Ser Tyr 225 230 235240 Arg Leu Asp Tyr Ile Ala Arg Glu Glu Ile Gly Glu Gly Lys Val Asp 245250 255 Phe Asp Val Arg Phe Tyr His Glu Ile Pro Val Tyr Pro Asp Lys Lys260 265 270 Leu Val Glu Tyr Asn Ala Val Asp Ala Ile Leu Met Glu Glu IleGlu 275 280 285 Asn Lys Asn His Ile Leu Pro Thr Leu Phe Glu Ile Ala ArgLeu Ser 290 295 300 Asn Leu Thr Pro Ala Leu Ala Leu Asn Ala Ser Asn IleLeu Ile Gly 305 310 315 320 Asn Val Thr Gly Lys Leu Gly Val Lys Phe ValAsp Tyr Ile Lys Lys 325 330 335 Ile Asp Thr Ile Asn Thr Met Phe Lys LysIle Pro Glu 340 345 58 1764 DNA Bacteriophage RM378 CDS (142)...(1707)58 ctatacggat gaagttttga gaattattga tctttctcca ctcgatggcg tattatacaa 60atgtgattta aaagacacct accttatcga ggtgaaagat acccattttg atcccgcaat 120gtaaaacaaa cgtattctgc t atg aac atc aac aag tat cgt tat cgc ggt 171 MetAsn Ile Asn Lys Tyr Arg Tyr Arg Gly 1 5 10 gct tac att gaa ctt acc aacccc gat att tac ttc aac gta ttc gat 219 Ala Tyr Ile Glu Leu Thr Asn ProAsp Ile Tyr Phe Asn Val Phe Asp 15 20 25 ctt gat ttt aca tcg ctg tac ccctct gta atc agc aaa ttc aat atc 267 Leu Asp Phe Thr Ser Leu Tyr Pro SerVal Ile Ser Lys Phe Asn Ile 30 35 40 gat ccc gct acg ttc gta acg gag 
ttttac ggg tgt atg cgg gtg gag 315 Asp Pro Ala Thr Phe Val Thr Glu Phe TyrGly Cys Met Arg Val Glu 45 50 55 aac aaa gtg att ccg gta gat cag gaa gaaccg gaa ttc ggg ttt ccc 363 Asn Lys Val Ile Pro Val Asp Gln Glu Glu ProGlu Phe Gly Phe Pro 60 65 70 ctc tac atc ttc gat tca ggg atg aac cct tcttac cgg agt gaa ccc 411 Leu Tyr Ile Phe Asp Ser Gly Met Asn Pro Ser TyrArg Ser Glu Pro 75 80 85 90 ctc ttt gtc atc aac agc ttt gag gaa ctc cggcaa ttt tta aaa agt 459 Leu Phe Val Ile Asn Ser Phe Glu Glu Leu Arg GlnPhe Leu Lys Ser 95 100 105 cga aat atc att atg gtg ccc aac ccg tcg ggtatc tgc tgg ttt tac 507 Arg Asn Ile Ile Met Val Pro Asn Pro Ser Gly IleCys Trp Phe Tyr 110 115 120 agg aaa gag ccg gtt ggc gtg ctt cct tct atcatt cgg gag att ttc 555 Arg Lys Glu Pro Val Gly Val Leu Pro Ser Ile IleArg Glu Ile Phe 125 130 135 acc cga cgt aag gaa gaa cgt aag ctt ttc aaagaa act ggc aac atg 603 Thr Arg Arg Lys Glu Glu Arg Lys Leu Phe Lys GluThr Gly Asn Met 140 145 150 gaa cac cat ttc cgt caa tgg gca ctt aaa attatg atg aac tcc atg 651 Glu His His Phe Arg Gln Trp Ala Leu Lys Ile MetMet Asn Ser Met 155 160 165 170 tac ggt atc ttc gga aac cgt tcg gtg tacatg ggg tgc ctt ccc att 699 Tyr Gly Ile Phe Gly Asn Arg Ser Val Tyr MetGly Cys Leu Pro Ile 175 180 185 gcg gaa agt gta acc gcc gcc ggg cgc atgtct att cgc tcc gtg att 747 Ala Glu Ser Val Thr Ala Ala Gly Arg Met SerIle Arg Ser Val Ile 190 195 200 tct cag att cgc gat cgc ttc att tat tcgcat acc gac tcc att ttc 795 Ser Gln Ile Arg Asp Arg Phe Ile Tyr Ser HisThr Asp Ser Ile Phe 205 210 215 gtc aaa gct ttt acg gat gat ccg gtg gcggaa gcc ggt gag ctt caa 843 Val Lys Ala Phe Thr Asp Asp Pro Val Ala GluAla Gly Glu Leu Gln 220 225 230 gaa cat ctc aac tct ttt atc aat gac tatatg gaa aat aac ttt aat 891 Glu His Leu Asn Ser Phe Ile Asn Asp Tyr MetGlu Asn Asn Phe Asn 235 240 245 250 gca aga gaa gat ttc aag ctg gag ttaaag cag gag ttc gtg ttc aaa 939 Ala Arg Glu Asp Phe Lys Leu Glu Leu LysGln Glu Phe Val Phe Lys 255 260 265 tcc att ctt atc aaa gaa atc aac cgctac ttt 
gcg gtt act gta gac 987 Ser Ile Leu Ile Lys Glu Ile Asn Arg TyrPhe Ala Val Thr Val Asp 270 275 280 ggt aaa gaa gag atg aag gga atc gaagtg atc aac tct tcg gtg cct 1035 Gly Lys Glu Glu Met Lys Gly Ile Glu ValIle Asn Ser Ser Val Pro 285 290 295 gaa att gtc aag aag tat ttc agg ggttac ctg aag tat atc agc caa 1083 Glu Ile Val Lys Lys Tyr Phe Arg Gly TyrLeu Lys Tyr Ile Ser Gln 300 305 310 ccc gac atc gat gtc att tcc gcc accata gcg ttc tac aat aac ttt 1131 Pro Asp Ile Asp Val Ile Ser Ala Thr IleAla Phe Tyr Asn Asn Phe 315 320 325 330 gtg tct caa aag aat ttc tgg tctatt gaa gat ctc tat cac aaa atg 1179 Val Ser Gln Lys Asn Phe Trp Ser IleGlu Asp Leu Tyr His Lys Met 335 340 345 aaa ata tct tcg tct gac agc gccgaa aga tat gtg gag tat gta gag 1227 Lys Ile Ser Ser Ser Asp Ser Ala GluArg Tyr Val Glu Tyr Val Glu 350 355 360 gaa gtt atg aag atg aaa aag gagaat gtc cca atc tct gag ata ttc 1275 Glu Val Met Lys Met Lys Lys Glu AsnVal Pro Ile Ser Glu Ile Phe 365 370 375 ata aaa atg tat gac cat aca cttccc att cat tat aag gga gcg ctt 1323 Ile Lys Met Tyr Asp His Thr Leu ProIle His Tyr Lys Gly Ala Leu 380 385 390 ttc gct tcc att ata gga tgc aaaccc ccg caa atg gga gac aag atc 1371 Phe Ala Ser Ile Ile Gly Cys Lys ProPro Gln Met Gly Asp Lys Ile 395 400 405 410 tac tgg ttc tac tgc acc atgctg gat cct tcc aga acc aat ctc ccg 1419 Tyr Trp Phe Tyr Cys Thr Met LeuAsp Pro Ser Arg Thr Asn Leu Pro 415 420 425 ctt tct ctg gaa gaa gtt aacccc gaa cat ggg agc ggc gtg tgg gat 1467 Leu Ser Leu Glu Glu Val Asn ProGlu His Gly Ser Gly Val Trp Asp 430 435 440 att ctg aaa gcg gga aag aaaacg cat atc aac agg ctc cgc aat atc 1515 Ile Leu Lys Ala Gly Lys Lys ThrHis Ile Asn Arg Leu Arg Asn Ile 445 450 455 cac gca ctt agc ata cgt gaggat gat gag gag ggt ctt gaa atc gtt 1563 His Ala Leu Ser Ile Arg Glu AspAsp Glu Glu Gly Leu Glu Ile Val 460 465 470 aaa aaa tac ata gat aga gacaaa tac tgt cag atc att tca gag aaa 1611 Lys Lys Tyr Ile Asp Arg Asp LysTyr Cys Gln Ile Ile Ser Glu Lys 475 480 485 490 aca att gat ctg ctg aaaagt 
ctc ggg tat gtt gaa aat act aca aag 1659 Thr Ile Asp Leu Leu Lys SerLeu Gly Tyr Val Glu Asn Thr Thr Lys 495 500 505 ata aaa acc gtt gag gatctt att cgt ttt ctt gta gag agt gaa aac 1707 Ile Lys Thr Val Glu Asp LeuIle Arg Phe Leu Val Glu Ser Glu Asn 510 515 520 taaacccatt agcgccatgattctcaaatt cgacactgaa ggcattgttc gtatcct 1764 59 522 PRT BacteriophageRM378 59 Met Asn Ile Asn Lys Tyr Arg Tyr Arg Gly Ala Tyr Ile Glu Leu Thr1 5 10 15 Asn Pro Asp Ile Tyr Phe Asn Val Phe Asp Leu Asp Phe Thr SerLeu 20 25 30 Tyr Pro Ser Val Ile Ser Lys Phe Asn Ile Asp Pro Ala Thr PheVal 35 40 45 Thr Glu Phe Tyr Gly Cys Met Arg Val Glu Asn Lys Val Ile ProVal 50 55 60 Asp Gln Glu Glu Pro Glu Phe Gly Phe Pro Leu Tyr Ile Phe AspSer 65 70 75 80 Gly Met Asn Pro Ser Tyr Arg Ser Glu Pro Leu Phe Val IleAsn Ser 85 90 95 Phe Glu Glu Leu Arg Gln Phe Leu Lys Ser Arg Asn Ile IleMet Val 100 105 110 Pro Asn Pro Ser Gly Ile Cys Trp Phe Tyr Arg Lys GluPro Val Gly 115 120 125 Val Leu Pro Ser Ile Ile Arg Glu Ile Phe Thr ArgArg Lys Glu Glu 130 135 140 Arg Lys Leu Phe Lys Glu Thr Gly Asn Met GluHis His Phe Arg Gln 145 150 155 160 Trp Ala Leu Lys Ile Met Met Asn SerMet Tyr Gly Ile Phe Gly Asn 165 170 175 Arg Ser Val Tyr Met Gly Cys LeuPro Ile Ala Glu Ser Val Thr Ala 180 185 190 Ala Gly Arg Met Ser Ile ArgSer Val Ile Ser Gln Ile Arg Asp Arg 195 200 205 Phe Ile Tyr Ser His ThrAsp Ser Ile Phe Val Lys Ala Phe Thr Asp 210 215 220 Asp Pro Val Ala GluAla Gly Glu Leu Gln Glu His Leu Asn Ser Phe 225 230 235 240 Ile Asn AspTyr Met Glu Asn Asn Phe Asn Ala Arg Glu Asp Phe Lys 245 250 255 Leu GluLeu Lys Gln Glu Phe Val Phe Lys Ser Ile Leu Ile Lys Glu 260 265 270 IleAsn Arg Tyr Phe Ala Val Thr Val Asp Gly Lys Glu Glu Met Lys 275 280 285Gly Ile Glu Val Ile Asn Ser Ser Val Pro Glu Ile Val Lys Lys Tyr 290 295300 Phe Arg Gly Tyr Leu Lys Tyr Ile Ser Gln Pro Asp Ile Asp Val Ile 305310 315 320 Ser Ala Thr Ile Ala Phe Tyr Asn Asn Phe Val Ser Gln Lys AsnPhe 325 330 335 Trp Ser Ile Glu Asp Leu Tyr His Lys Met Lys Ile Ser SerSer Asp 340 
345 350 Ser Ala Glu Arg Tyr Val Glu Tyr Val Glu Glu Val MetLys Met Lys 355 360 365 Lys Glu Asn Val Pro Ile Ser Glu Ile Phe Ile LysMet Tyr Asp His 370 375 380 Thr Leu Pro Ile His Tyr Lys Gly Ala Leu PheAla Ser Ile Ile Gly 385 390 395 400 Cys Lys Pro Pro Gln Met Gly Asp LysIle Tyr Trp Phe Tyr Cys Thr 405 410 415 Met Leu Asp Pro Ser Arg Thr AsnLeu Pro Leu Ser Leu Glu Glu Val 420 425 430 Asn Pro Glu His Gly Ser GlyVal Trp Asp Ile Leu Lys Ala Gly Lys 435 440 445 Lys Thr His Ile Asn ArgLeu Arg Asn Ile His Ala Leu Ser Ile Arg 450 455 460 Glu Asp Asp Glu GluGly Leu Glu Ile Val Lys Lys Tyr Ile Asp Arg 465 470 475 480 Asp Lys TyrCys Gln Ile Ile Ser Glu Lys Thr Ile Asp Leu Leu Lys 485 490 495 Ser LeuGly Tyr Val Glu Asn Thr Thr Lys Ile Lys Thr Val Glu Asp 500 505 510 LeuIle Arg Phe Leu Val Glu Ser Glu Asn 515 520 60 1619 DNA BacteriophageRM378 60 ccggtttgat acccgtattg gtcatttcct tgtggaaacc ccggttgaaaagtggagtaa 60 caaaatgttg cgcgtagctg aaaaacttgt aaccaattcc cgtaaacagatttacgaagg 120 aggtgtgtga ttgctacggt ttcctatccg gaaactatga agttgtagacgaactccctg 180 atcaaccgac gcttccgaaa actcaaaaca agacttatag tacgctatggaatcgatgaa 240 cgtaaaatac ccggttgagt accttatcga acacctgaac tcttttgagtctccggaagt 300 agccgtcgaa tcccttcgca aggaggggat tatgtgcaaa aaccggggtgatctatacat 360 gttcaaatat caccttggtt gtaagtttga taagatatat caccttgcctgtcgcggggc 420 gattctccgc aaaacggata gtggttggaa agttctgtct tatccctttgacaaattttt 480 caactggggg gaagaactcc agccggaaat cgtaaactat tatcagacgcttcgttacgc 540 gtctcccctg aatgaaaagc gcaaagccgg tttcatgttc aaacttcccatgaaactggt 600 tgaaaagctg gatggtactt gtgtggtttt atattatgat gaagggtggaaaattcacac 660 tcttgggagt attgacgcaa atggatccat tgtcaaaaac ggaatggttaccactcatat 720 ggataaaaca tatcgagaat tgttctggga aacctttgaa aagaaatatccgccttacct 780 tctctatcat ttgaactcct catactgtta catatttgaa atggttcatccggacgcgcg 840 agtggtggtt ccttatgagg agccaaatat cattctgatc ggtgtgcgttcggtggatcc 900 ggagaaggga tatttcgagg tgggtccctc cgaagaagcc gtacgcattttcaacgaaag 960 tggcggaaaa ataaatctta agctaccggc tgttctgtct 
caagagcaaaactatactct 1020 ttttcgtgcc aatcgccttc aggaactatt tgaggaagtt acaccgcttttcaaaagcct 1080 gagagacggt tatgaggtgg tatatgaagg atttgtagcc gtacaggaaattgccccgcg 1140 tgtttattac cgcacaaaga tcaagcaccc ggtatatctg gagctccaccggattaaaac 1200 tacaatcact cctgagaagc tcgccgatct ttttcttgaa aacaaacttgatgattttgt 1260 acttaccccg gatgaacagg aaaccgtgat gaaactcaaa gaaatttataccgatatgcg 1320 aaatcagctt gagtcatctt ttgatacgat ttataaagag atttccgaacaggtttctcc 1380 ggaagaaaac cccggagagt ttcgcaaaag gttcgctctt cgacttatggattatcatga 1440 taaaagttgg ttttttgccc gccttgacgg cgacgaagag aaaatgcaaaagtcggaaaa 1500 gaagcttcta acggagagaa ttgaaaaggg gttatttaaa taaaaatgataaaaaagcgt 1560 aatcctcttt tctggggaag acgggaactc aatcttcttc agcattttgcccttgaagc 1619 61 1440 DNA Bacteriophage RM378 61 gcttcgtcaa aactcacgtctatagtatct atgtcgtagg gttcgaggtt ggaggcaatc 60 aggttgaaca gttcatcataatcataattc tcgaaaagaa tgttgcgaat accgatccct 120 ctttctggat cgtagggatattcccccggc tcgatgaaaa gcaggagttt tatcttatcg 180 atcaggagtt ttaccgggtcatcaggaaat ctgaaattcg gtgcagtgtc gttcagatag 240 aacatttcat ttttgtttaaataaatcctc gaggaatctt caaataaaga ggggcgttaa 300 tggatgaaaa gactgaggaatatggtcaat cttatcgatc tcaaaaatca gtattatgct 360 tactctttca agtttttcgactcctatcag atcagctggg ataattaccc gcatcttaaa 420 gagttcgtca ttgaaaactatcccggcact tatttttcat gctacgctcc ggggattctg 480 tacaagcttt tcctcaaatggaagcggggt atgatcattg acgactatga ccgacacccg 540 ctccgaaaga agttacttcctcagtacaaa gagcaccgct atgaatacat tgagggaaaa 600 tacggtgtgg ttcctttccccgggtttctg aaatatctga agttccactt tgaggacttg 660 cggtttaaaa tgcgcgatcttggaatcacc gatttcaaat atgcacttgc catttctctt 720 ttttacaacc gggtaatgctcagagatttt ctgaaaaact ttacctgtta ttacattgcc 780 gaatatgaag ctgacgatgtaatcgcacat ctggcgcgtg agattgcacg aagcaatatc 840 gacgtaaaca tcgtctcaacggataaagat tattaccagc tatgggatga agaggatata 900 agagaaaggg tttatatcaattctctttca tgtagtgatg tgaagacacc ccgctacgga 960 tttcttacca ttaaagcacttcttggagac aaaagcgata acattcccaa atctctggaa 1020 aaaggaaaag gcgaaaagtatcttgaaaag aaaggatttg cggaggaaga ttacgataag 
1080 gaactattcg agaataatctgaaggtgatc aggtttggag acgaatatct tggagaaagg 1140 gataaaagct ttatagaaaatttttctacg ggggatactc tgtggaactt ttatgaattt 1200 ttttactatg accctttgcatgaacttttc ctcagaaata taagaaagag gagactatga 1260 aagtactcgc atttaccgatgcacctacgt ttcccacggg ggtgggtcat cagcttcaca 1320 acattatcaa ttacgggtttgacgcaaccg atcgctgggt tgtggtgcac ccgccccggt 1380 cgccaagggc tggagagactaaaaacgtcg ttattggaaa cactccagtc aagcttatca 1440 62 1508 DNABacteriophage RM378 62 acttcccaaa tgctatgtgg aggtggatga tagaaagcgtattgttaatg aagaggcggt 60 caagtctttt ctccataagc atgttaccga actgctgaagaattatcagt aacccaaacc 120 taaacccgaa aaatatatgg aaacgattgt aatttcccaaaacaatacga cggagatgac 180 ggaacccccc cagaacattt ccgattcggt taaaagcgggtttatctatc ttatcgaaaa 240 gtctcatttc cttgaaaaga aaaacttcct taaaatcatatcgaacatgg acccccgccg 300 catttccaat ccggaggtgc gcgtggtggc ggagtacatatatgattatt tcaaaagtca 360 tagtaatttc ccttctaaaa gaaatctttg ccatcactttgagtggagcg aagatctgga 420 aggagacccc gccgattatc agcgtatcat tcagtatctcaaatcttctt acattcgatc 480 ctctataaca aaaacgcttt catatcttga gaaggatgacctttccgcgt tgaaagaaat 540 tgtcagagcc attcgggtgg tggaggatag tggggtgtcgctggtggagg aattcgatct 600 tgcaaccagc gagtttaatg aactttttgt taaagaagaacgcattccca ccccctggga 660 gagtgtaaac aaaaatatgg cgggcggtct tggtcggggagagcttggaa tcgttatgct 720 tccttcgggg tggggtaagt catggttcct tgtttcacttggtcttcatg cctttcgaac 780 gggtaagcgc gtgatttatt tcactctgga gcttgaccaaaaatatgtga tgaagcggtt 840 tttaaagatg tttgcacctt attgcaaagg acgcgcttcttcctatcgcg acgtttatca 900 aataatgaaa gagcttatgt tttctcagga taatcttttgragattgttt tctgtaatgc 960 gatggaagat attgagcact atattgcgct gtataaccccgacgttgtgc tgattgacta 1020 tgccgatctt atttatgatg tggaaaccga caaagagaaaaattatctgc ttttgcaaaa 1080 aatttatagg aaacttcgtc tcattgcaaa ggtatataatacagcagtat ggagcgcctc 1140 tcagcttaat cgcggttccc tttcaaagca agccgacgtcgatttcattg agaaatacat 1200 tgccgattca tttgcaaaag ttkttgaaat cgacttcgggatggcgttta ttccggatag 1260 cgagaactca acccccgata ttcacgtcgg attcggtaaaatcttcaaaa accgtatggg 1320 tgcggtaaga 
aagctggaat atacaattaa ctttgaaaac tatacggtag acgttgctgt 1380
taaatgacac aagttaagac aaaagggctt aaagacatca gaataggtag aaaggagggt 1440
aagttcacac atgtaaatac aacaaagaaa ggaaagaata agaaatattt cagggcggaa 1500
catgaacg 1508

63 8 PRT Artificial Sequence Peptide
63 Asp Xaa Xaa Ser Leu Tyr Pro Ser
1 5

64 33 DNA Artificial Sequence Nucleic acid
64 cacgagctca tgaagatcac gctaagcgca agc 33

65 33 DNA Artificial Sequence Nucleic acid
65 acaggtacct tactcaggta tttttttgaa cat 33

66 33 DNA Artificial Sequence Nucleic acid
66 cacgagctca tgaacatcaa caagtatcgt tat 33

67 30 DNA Artificial Sequence Nucleic acid
67 acaggtacct tagttttcac tctctacaag 30

68 28 DNA Artificial Sequence Nucleic acid
68 gggaattctt atgaacgtaa aatacccg 28

69 28 DNA Artificial Sequence Nucleic acid
69 ggagatctta tttaaataac cccttttc 28

70 30 DNA Artificial Sequence Nucleic acid
70 gggaattctt atgaaaagac tgaggaatat 30

71 26 DNA Artificial Sequence Nucleic acid
71 ggagatctca tagtctcctc tttctt 26

72 31 DNA Artificial Sequence Nucleic acid
72 gggcaattgt tatggaaacg attgtaattt c 31

73 26 DNA Artificial Sequence Nucleic acid
73 cgggatcctc atttaacagc aacgtc 26
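
The relationship between the short artificial sequences and the longer phage sequences can be checked mechanically. The following is a minimal Python sketch, not part of the deposited listing, that tests whether the 3′ ends of SEQ ID NO:68 and SEQ ID NO:69 match the two ends of the coding region in SEQ ID NO:60. The pairing of these two primers with SEQ ID NO:60 is an assumption inferred from sequence similarity; this section does not state it. The helper names `reverse_complement` and `annealing_length` are hypothetical, and the two template fragments are short excerpts copied from SEQ ID NO:60 around its apparent start and stop codons. The hexamers `gaattc` and `agatct` are the standard recognition sequences of EcoRI and BglII; their presence in the 5′ tails is an observation, not a statement from the source.

```python
# Illustrative sketch only: the primer/template pairing is an assumption.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_length(primer: str, template: str) -> int:
    """Length of the longest 3'-terminal stretch of `primer` found in `template`."""
    for k in range(len(primer), 0, -1):
        if primer[-k:] in template:
            return k
    return 0


# Primers copied from the listing.
FORWARD_68 = "gggaattcttatgaacgtaaaatacccg"  # SEQ ID NO:68
REVERSE_69 = "ggagatcttatttaaataaccccttttc"  # SEQ ID NO:69

# Excerpts of SEQ ID NO:60 copied from the listing (near the start and stop codons).
SEQ60_START = "aatcgatgaacgtaaaatacccggttgagt"
SEQ60_END = "ttgaaaaggggttatttaaataaaaatgat"

# The forward primer is compared with the strand as listed; the reverse
# primer is compared with the complementary strand.
fwd_len = annealing_length(FORWARD_68, SEQ60_START)
rev_len = annealing_length(REVERSE_69, reverse_complement(SEQ60_END))

# Restriction-site hexamers in the 5' tails (EcoRI: gaattc, BglII: agatct).
has_ecori = "gaattc" in FORWARD_68[: len(FORWARD_68) - fwd_len + 6]
has_bglii = "agatct" in REVERSE_69[:8]

print(f"forward primer 3' match: {fwd_len} nt, EcoRI site in tail: {has_ecori}")
print(f"reverse primer 3' match: {rev_len} nt, BglII site in tail: {has_bglii}")
```

The same two helpers could be applied to the other primer pairs in the listing by substituting the corresponding excerpts, again on the assumption that the pairs are matched correctly.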

What is claimed is:
 1. An isolated nucleic acid molecule comprising a nucleotide sequence of an open reading frame of the nucleotide sequence SEQ ID NO:1, wherein the open reading frame is ORF 1218a (locus DAS).
 2. An isolated nucleic acid molecule which encodes a polypeptide obtainable from bacteriophage RM 378, or an active derivative or fragment thereof, wherein the polypeptide is a 5′-3′ exonuclease, wherein the bacteriophage RM 378 is deposited in Rhodothermus marinus strain ITI 378 infected with bacteriophage RM 378 in the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ), accession number DSM 12831.
 3. An isolated nucleic acid molecule of claim 2, wherein the polypeptide is a derivative possessing substantial sequence identity with the endogenous polypeptide.
 4. An isolated nucleic acid molecule which encodes a polypeptide that possesses substantial sequence identity with an endogenous polypeptide obtainable from bacteriophage RM 378, wherein the polypeptide is a 5′-3′ exonuclease.
 5. A DNA construct comprising an isolated nucleic acid molecule of claim 1, operatively linked to a regulatory sequence.
 6. A host cell comprising a DNA construct of claim 5.
 7. A DNA construct comprising an isolated nucleic acid molecule of claim 2, operatively linked to a regulatory sequence.
 8. A host cell comprising a DNA construct of claim 7.
 9. A DNA construct comprising an isolated nucleic acid molecule of claim 4, operatively linked to a regulatory sequence.
 10. A host cell comprising a DNA construct of claim 9.